Data Analysis and Techniques

for predicting the speed for pet adoption

In this notebook, we will see different regression methods and how well do they predict on our dataset, together with methods like:

  • Pretprocessing
  • Handling missing values
  • Feature Selection
  • Classification
  • Class balancing methods

The dataset used is from Kaggle, named PetFinder.my Adoption Prediction.

Describing the data fields:

  • PetID - Unique hash ID of pet profile .
  • AdoptionSpeed - Categorical speed of adoption. Lower is faster. This is the value to predict. See below section for more info
  • Type - Type of animal (1 = Dog, 2 = Cat)
  • Name - Name of pet (Empty if not named)
  • Age - Age of pet when listed, in months
  • Breed1 - Primary breed of pet (Refer to BreedLabels dictionary)
  • Breed2 - Secondary breed of pet, if pet is of mixed breed (Refer to BreedLabels dictionary)
  • Gender - Gender of pet (1 = Male, 2 = Female, 3 = Mixed, if profile represents group of pets)
  • Color1 - Color 1 of pet (Refer to ColorLabels dictionary)
  • Color2 - Color 2 of pet (Refer to ColorLabels dictionary)
  • Color3 - Color 3 of pet (Refer to ColorLabels dictionary)
  • MaturitySize - Size at maturity (1 = Small, 2 = Medium, 3 = Large, 4 = Extra Large, 0 = Not Specified)
  • FurLength - Fur length (1 = Short, 2 = Medium, 3 = Long, 0 = Not Specified)
  • Vaccinated - Pet has been vaccinated (1 = Yes, 2 = No, 3 = Not Sure)
  • Dewormed - Pet has been dewormed (1 = Yes, 2 = No, 3 = Not Sure)
  • Sterilized - Pet has been spayed / neutered (1 = Yes, 2 = No, 3 = Not Sure)
  • Health - Health Condition (1 = Healthy, 2 = Minor Injury, 3 = Serious Injury, 0 = Not Specified)
  • Quantity - Number of pets represented in profile
  • Fee - Adoption fee (0 = Free)
  • State - State location in Malaysia (Refer to StateLabels dictionary)
  • RescuerID - Unique hash ID of rescuer
  • VideoAmt - Total uploaded videos for this pet
  • PhotoAmt - Total uploaded photos for this pet)
  • Description - Profile write-up for this pet. The primary language used is English, with some in Malay or Chinese

AdoptionSpeed

Contestants are required to predict this value. The value is determined by how quickly, if at all, a pet is adopted. The values are determined in the following way:

  • 0 - Pet was adopted on the same day as it was listed.
  • 1 - Pet was adopted between 1 and 7 days (1st week) after being listed.
  • 2 - Pet was adopted between 8 and 30 days (1st month) after being listed.
  • 3 - Pet was adopted between 31 and 90 days (2nd & 3rd month) after being listed.
  • 4 - No adoption after 100 days of being listed. (There are no pets in this dataset that waited between 90 and 100 days).

Defining libraries and utility methods

Here we are defining the libraries and some functions that we will use later on our data:

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import csv
import seaborn as sns
import plotly
import plotly.offline
import plotly.graph_objs as go
import matplotlib.pyplot as plt
import os

from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import BayesianRidge, RidgeCV, LassoCV, LinearRegression, HuberRegressor, ElasticNetCV, Lasso, Ridge, ElasticNet
from sklearn.model_selection import cross_val_score
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import mean_squared_error, r2_score, make_scorer
from scipy import stats
from scipy.stats import skew
from sklearn.preprocessing import RobustScaler
sns.set()
## Models
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB 
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

## Model evaluators
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.model_selection import RandomizedSearchCV, GridSearchCV
from sklearn.metrics import confusion_matrix, classification_report
from sklearn.metrics import precision_score, recall_score, f1_score
from sklearn.metrics import plot_roc_curve

Exploratory Data Analysis

This part of the analysis provides pretprocessing (cleaning the noisy data, seeing if there are any outliers, encoding the attributes if it is needed) and a better understanding of the dataset through visualizations. This is very important because it would help us choose the right technique for data mining and the right estimator for the accuracy of the algorithms.

Pretprocessing

Importing the dataset

We can see that the dataset has 14994 rows, while we will test 2998 rows, which shows us is 80-20 split, but we will do this later in the next step with sklearn train_test_split.

We also see that there are many null values in the Names attribute, and that is the reason why we are going to add a new class named Undefined for each of those with missing names.After cleaning the dataset, we can see that we still have one column with null values: Speed, that we are going to drop the Speed column in the process of cleaning data, since we already have a column Addoption speed.

After cleaning the train and test dataset, we will drop the Description column as irrelevant for this prediction because we are not going to use natural language processing algorithms.

In [2]:
#importing the dataset
df_train = pd.read_csv("data/train.csv")

print(df_train.info())
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14993 entries, 0 to 14992
Data columns (total 24 columns):
 #   Column         Non-Null Count  Dtype 
---  ------         --------------  ----- 
 0   Type           14993 non-null  int64 
 1   Name           13736 non-null  object
 2   Age            14993 non-null  int64 
 3   Breed1         14993 non-null  int64 
 4   Breed2         14993 non-null  int64 
 5   Gender         14993 non-null  int64 
 6   Color1         14993 non-null  int64 
 7   Color2         14993 non-null  int64 
 8   Color3         14993 non-null  int64 
 9   MaturitySize   14993 non-null  int64 
 10  FurLength      14993 non-null  int64 
 11  Vaccinated     14993 non-null  int64 
 12  Dewormed       14993 non-null  int64 
 13  Sterilized     14993 non-null  int64 
 14  Health         14993 non-null  int64 
 15  Quantity       14993 non-null  int64 
 16  Fee            14993 non-null  int64 
 17  State          14993 non-null  int64 
 18  RescuerID      14993 non-null  object
 19  VideoAmt       14993 non-null  int64 
 20  Description    14981 non-null  object
 21  PetID          14993 non-null  object
 22  PhotoAmt       14993 non-null  int64 
 23  AdoptionSpeed  14993 non-null  int64 
dtypes: int64(20), object(4)
memory usage: 2.7+ MB
None

Splitting the Dataset in Train and Test dataset

In [3]:
# Everything except AdoptionSPeed variable
X = df_train.drop("AdoptionSpeed", axis=1)
# Everything except Name, RescueID,Description,PetID,State variables
X=df_train.drop(columns = ['Name','RescuerID','Description','PetID','State'])

# Target variable
y = df_train["AdoptionSpeed"]
# Independent variables (no target column)
print(X.head())

print(y.head())
np.random.seed(42)
train_df,test_df=train_test_split(df_train,train_size=11900,test_size=2998)
   Type  Age  Breed1  Breed2  Gender  Color1  Color2  Color3  MaturitySize  \
0     2    3     299       0       1       1       7       0             1   
1     2    1     265       0       1       1       2       0             2   
2     1    1     307       0       1       2       7       0             2   
3     1    4     307       0       2       1       2       0             2   
4     1    1     307       0       1       1       0       0             2   

   FurLength  Vaccinated  Dewormed  Sterilized  Health  Quantity  Fee  \
0          1           2         2           2       1         1  100   
1          2           3         3           3       1         1    0   
2          2           1         1           2       1         1    0   
3          1           1         1           2       1         1  150   
4          1           2         2           2       1         1    0   

   VideoAmt  PhotoAmt  AdoptionSpeed  
0         0         1              2  
1         0         2              0  
2         0         7              3  
3         0         8              2  
4         0         3              2  
0    2
1    0
2    3
3    2
4    2
Name: AdoptionSpeed, dtype: int64
In [4]:
test_df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 2998 entries, 13408 to 5513
Data columns (total 24 columns):
 #   Column         Non-Null Count  Dtype 
---  ------         --------------  ----- 
 0   Type           2998 non-null   int64 
 1   Name           2741 non-null   object
 2   Age            2998 non-null   int64 
 3   Breed1         2998 non-null   int64 
 4   Breed2         2998 non-null   int64 
 5   Gender         2998 non-null   int64 
 6   Color1         2998 non-null   int64 
 7   Color2         2998 non-null   int64 
 8   Color3         2998 non-null   int64 
 9   MaturitySize   2998 non-null   int64 
 10  FurLength      2998 non-null   int64 
 11  Vaccinated     2998 non-null   int64 
 12  Dewormed       2998 non-null   int64 
 13  Sterilized     2998 non-null   int64 
 14  Health         2998 non-null   int64 
 15  Quantity       2998 non-null   int64 
 16  Fee            2998 non-null   int64 
 17  State          2998 non-null   int64 
 18  RescuerID      2998 non-null   object
 19  VideoAmt       2998 non-null   int64 
 20  Description    2995 non-null   object
 21  PetID          2998 non-null   object
 22  PhotoAmt       2998 non-null   int64 
 23  AdoptionSpeed  2998 non-null   int64 
dtypes: int64(20), object(4)
memory usage: 585.5+ KB
In [5]:
#Get first five rows from dataset
df_train.head()
Out[5]:
Type Name Age Breed1 Breed2 Gender Color1 Color2 Color3 MaturitySize ... Health Quantity Fee State RescuerID VideoAmt Description PetID PhotoAmt AdoptionSpeed
0 2 Nibble 3 299 0 1 1 7 0 1 ... 1 1 100 41326 8480853f516546f6cf33aa88cd76c379 0 Nibble is a 3+ month old ball of cuteness. He ... 86e1089a3 1 2
1 2 No Name Yet 1 265 0 1 1 2 0 2 ... 1 1 0 41401 3082c7125d8fb66f7dd4bff4192c8b14 0 I just found it alone yesterday near my apartm... 6296e909a 2 0
2 1 Brisco 1 307 0 1 2 7 0 2 ... 1 1 0 41326 fa90fa5b1ee11c86938398b60abc32cb 0 Their pregnant mother was dumped by her irresp... 3422e4906 7 3
3 1 Miko 4 307 0 2 1 2 0 2 ... 1 1 150 41401 9238e4f44c71a75282e62f7136c6b240 0 Good guard dog, very alert, active, obedience ... 5842f1ff5 8 2
4 1 Hunter 1 307 0 1 1 0 0 2 ... 1 1 0 41326 95481e953f8aed9ec3d16fc4509537e8 0 This handsome yet cute boy is up for adoption.... 850a43f90 3 2

5 rows × 24 columns

In [7]:
#Get all columns names
print(df_train.columns)
Index(['Type', 'Name', 'Age', 'Breed1', 'Breed2', 'Gender', 'Color1', 'Color2',
       'Color3', 'MaturitySize', 'FurLength', 'Vaccinated', 'Dewormed',
       'Sterilized', 'Health', 'Quantity', 'Fee', 'State', 'RescuerID',
       'VideoAmt', 'Description', 'PetID', 'PhotoAmt', 'AdoptionSpeed'],
      dtype='object')

Filling the missing values

In the column Name and Description, there are some missing values that we are going to fill with a new class Undef

In [8]:
#For every columns with this function we see if somewhere in database hass null value
#But this is not so clearly
print(df_train.isnull())
        Type   Name    Age  Breed1  Breed2  Gender  Color1  Color2  Color3  \
0      False  False  False   False   False   False   False   False   False   
1      False  False  False   False   False   False   False   False   False   
2      False  False  False   False   False   False   False   False   False   
3      False  False  False   False   False   False   False   False   False   
4      False  False  False   False   False   False   False   False   False   
...      ...    ...    ...     ...     ...     ...     ...     ...     ...   
14988  False   True  False   False   False   False   False   False   False   
14989  False  False  False   False   False   False   False   False   False   
14990  False  False  False   False   False   False   False   False   False   
14991  False  False  False   False   False   False   False   False   False   
14992  False  False  False   False   False   False   False   False   False   

       MaturitySize  ...  Health  Quantity    Fee  State  RescuerID  VideoAmt  \
0             False  ...   False     False  False  False      False     False   
1             False  ...   False     False  False  False      False     False   
2             False  ...   False     False  False  False      False     False   
3             False  ...   False     False  False  False      False     False   
4             False  ...   False     False  False  False      False     False   
...             ...  ...     ...       ...    ...    ...        ...       ...   
14988         False  ...   False     False  False  False      False     False   
14989         False  ...   False     False  False  False      False     False   
14990         False  ...   False     False  False  False      False     False   
14991         False  ...   False     False  False  False      False     False   
14992         False  ...   False     False  False  False      False     False   

       Description  PetID  PhotoAmt  AdoptionSpeed  
0            False  False     False          False  
1            False  False     False          False  
2            False  False     False          False  
3            False  False     False          False  
4            False  False     False          False  
...            ...    ...       ...            ...  
14988        False  False     False          False  
14989        False  False     False          False  
14990        False  False     False          False  
14991        False  False     False          False  
14992        False  False     False          False  

[14993 rows x 24 columns]
In [9]:
# Print how many null values we have for all attributes
print(df_train.isnull().values.any())
df_train.isnull().sum()
True
Out[9]:
Type                0
Name             1257
Age                 0
Breed1              0
Breed2              0
Gender              0
Color1              0
Color2              0
Color3              0
MaturitySize        0
FurLength           0
Vaccinated          0
Dewormed            0
Sterilized          0
Health              0
Quantity            0
Fee                 0
State               0
RescuerID           0
VideoAmt            0
Description        12
PetID               0
PhotoAmt            0
AdoptionSpeed       0
dtype: int64
In [10]:
#Here is an example where Name is NaN in 36 row
df_train[30:40]
Out[10]:
Type Name Age Breed1 Breed2 Gender Color1 Color2 Color3 MaturitySize ... Health Quantity Fee State RescuerID VideoAmt Description PetID PhotoAmt AdoptionSpeed
30 1 Benji & Kimi 4 205 218 3 2 7 0 1 ... 1 2 0 41326 aa66486163b6cbc25ea62a34b11c9b91 0 Benji and his sister Kimi are a handsome pair ... 1a76190c5 5 3
31 2 Kekok 1 266 0 1 1 0 0 1 ... 1 1 0 41361 3cbe8b84e5b4852b5923c348fefcdf31 0 tataO betine ke jantan...ank pd ibu gak edb920079 1 1
32 1 BoiBoi 24 307 0 1 5 7 0 2 ... 1 5 0 41326 2147467fcd35e7a3bc23b9edcffc5702 0 Boiboi is rescued by my daughter 2 years ago f... 543130f60 1 4
33 2 NaN 4 266 0 2 1 6 7 1 ... 1 2 0 41326 7d58438884ab468dce87c7e252bbd6e4 0 Two gorgeous kittens have just lost their mumm... 9415bc79e 7 3
34 2 Kitten Girl Girl 3 266 0 2 1 7 0 1 ... 1 1 1 41327 6e4f4078c85aaa01b4059fbf679e6695 0 Open for adoption!!! 4c6fe4100 5 4
35 2 Tom 12 264 0 2 6 0 0 2 ... 1 1 0 41326 76f0a2be251f10691a62e16409b95e47 0 email for more enquiry 0737e4f11 5 1
36 2 NaN 24 265 292 2 1 4 0 2 ... 1 3 0 41326 90b00f90ffdf9ec1cac529a2bbef3ecc 0 she fat n healthy. in door cat 61fa73996 2 4
37 2 Comel 2 265 0 2 2 0 0 1 ... 1 1 0 41326 b020ad4c866a9f17be0ab3beaa38c78c 0 si comel ini memerlukan seseorang yang boleh m... eabb13cea 3 1
38 1 WHISKY 12 307 0 1 1 0 0 2 ... 1 1 0 41336 31de822d0adce3e2dad7dcedfbee2ba8 0 This is Whisky, rescued from the streets. He i... deff069a2 4 1
39 1 Boy 12 307 0 1 5 0 0 2 ... 1 1 0 41326 6757d0b9d5b72d8b78c20e355c7fe62c 0 #NAME? 4e3640544 1 3

10 rows × 24 columns

In [11]:
#(rows, columns) 
df_train.shape
Out[11]:
(14993, 24)
In [12]:
#(rows, columns) without null values
df_train.dropna().shape
Out[12]:
(13724, 24)

We can conclude that the missing values are a name and a description and that it is better to fill them in than to delete them. So we made a new class for that purpose which is called 'Undef' and we use it to fill in the place where the value is missing (is not defined).

In [13]:
df_train=df_train.fillna({'Name':'Undef','Description':'Undef'})
df_train[30:40]
Out[13]:
Type Name Age Breed1 Breed2 Gender Color1 Color2 Color3 MaturitySize ... Health Quantity Fee State RescuerID VideoAmt Description PetID PhotoAmt AdoptionSpeed
30 1 Benji & Kimi 4 205 218 3 2 7 0 1 ... 1 2 0 41326 aa66486163b6cbc25ea62a34b11c9b91 0 Benji and his sister Kimi are a handsome pair ... 1a76190c5 5 3
31 2 Kekok 1 266 0 1 1 0 0 1 ... 1 1 0 41361 3cbe8b84e5b4852b5923c348fefcdf31 0 tataO betine ke jantan...ank pd ibu gak edb920079 1 1
32 1 BoiBoi 24 307 0 1 5 7 0 2 ... 1 5 0 41326 2147467fcd35e7a3bc23b9edcffc5702 0 Boiboi is rescued by my daughter 2 years ago f... 543130f60 1 4
33 2 Undef 4 266 0 2 1 6 7 1 ... 1 2 0 41326 7d58438884ab468dce87c7e252bbd6e4 0 Two gorgeous kittens have just lost their mumm... 9415bc79e 7 3
34 2 Kitten Girl Girl 3 266 0 2 1 7 0 1 ... 1 1 1 41327 6e4f4078c85aaa01b4059fbf679e6695 0 Open for adoption!!! 4c6fe4100 5 4
35 2 Tom 12 264 0 2 6 0 0 2 ... 1 1 0 41326 76f0a2be251f10691a62e16409b95e47 0 email for more enquiry 0737e4f11 5 1
36 2 Undef 24 265 292 2 1 4 0 2 ... 1 3 0 41326 90b00f90ffdf9ec1cac529a2bbef3ecc 0 she fat n healthy. in door cat 61fa73996 2 4
37 2 Comel 2 265 0 2 2 0 0 1 ... 1 1 0 41326 b020ad4c866a9f17be0ab3beaa38c78c 0 si comel ini memerlukan seseorang yang boleh m... eabb13cea 3 1
38 1 WHISKY 12 307 0 1 1 0 0 2 ... 1 1 0 41336 31de822d0adce3e2dad7dcedfbee2ba8 0 This is Whisky, rescued from the streets. He i... deff069a2 4 1
39 1 Boy 12 307 0 1 5 0 0 2 ... 1 1 0 41326 6757d0b9d5b72d8b78c20e355c7fe62c 0 #NAME? 4e3640544 1 3

10 rows × 24 columns

In [14]:
df_train.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14993 entries, 0 to 14992
Data columns (total 24 columns):
 #   Column         Non-Null Count  Dtype 
---  ------         --------------  ----- 
 0   Type           14993 non-null  int64 
 1   Name           14993 non-null  object
 2   Age            14993 non-null  int64 
 3   Breed1         14993 non-null  int64 
 4   Breed2         14993 non-null  int64 
 5   Gender         14993 non-null  int64 
 6   Color1         14993 non-null  int64 
 7   Color2         14993 non-null  int64 
 8   Color3         14993 non-null  int64 
 9   MaturitySize   14993 non-null  int64 
 10  FurLength      14993 non-null  int64 
 11  Vaccinated     14993 non-null  int64 
 12  Dewormed       14993 non-null  int64 
 13  Sterilized     14993 non-null  int64 
 14  Health         14993 non-null  int64 
 15  Quantity       14993 non-null  int64 
 16  Fee            14993 non-null  int64 
 17  State          14993 non-null  int64 
 18  RescuerID      14993 non-null  object
 19  VideoAmt       14993 non-null  int64 
 20  Description    14993 non-null  object
 21  PetID          14993 non-null  object
 22  PhotoAmt       14993 non-null  int64 
 23  AdoptionSpeed  14993 non-null  int64 
dtypes: int64(20), object(4)
memory usage: 2.7+ MB
In [15]:
test_df['Name']=test_df['Name'].fillna('Undef')
test_df
Out[15]:
Type Name Age Breed1 Breed2 Gender Color1 Color2 Color3 MaturitySize ... Health Quantity Fee State RescuerID VideoAmt Description PetID PhotoAmt AdoptionSpeed
13408 1 ♥♥♥ Lily ♥♥♥ 36 307 0 2 2 7 0 2 ... 1 1 0 41326 337914b09c2fa5460e195197e994ef98 0 Adorable 3 year old Lily looking for a forever... 3f8824a3b 1 4
6472 2 Cookie 3 266 0 1 6 7 0 2 ... 1 1 0 41327 4bb1ebb92158078ad54a6bb23c10dffc 0 i rescue this stary kitten from market near my... 9238eb7fc 1 2
9967 2 Favour Speedy Abundance And Courage 7 250 252 1 1 2 0 2 ... 1 4 0 41327 99ba8ce53b4d8515e417e7921563d923 0 The mother was a Burmese cross and had since p... f0a1f2b90 2 4
862 1 Undef 3 307 0 1 2 0 0 3 ... 1 1 0 41327 3f3ef74c486beba3bc87f6dbaee772bf 0 This puppy is: 1. Male 2. 3 months old 3. Brow... 7d028bdea 4 2
5967 2 Abandoned Kitty 1 266 0 1 1 6 7 1 ... 1 1 0 41401 844f03ab8054007d4be6686f3a9702b9 0 Mother cat gave birth to a litter of 3 and too... 8377bfe97 0 2
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
3944 1 Male_Puppy_7 Weeks Old 1 307 307 1 2 5 0 2 ... 1 1 0 41335 849d5c53eea804440adc2eebc4eecf4a 0 To All Kind Hearted, Hi. I came across a litte... 4d6350e54 2 1
8191 1 Monica 3 307 0 2 1 2 0 2 ... 1 1 50 41326 f0c3b065f8804b122ca7d61427d56661 0 Monica is 1 of 6 siblings who were abandoned a... 8a5d7e622 2 2
3297 1 Scott & Tyler 84 152 205 1 1 2 7 1 ... 1 2 0 41326 fe6b58872037cf640f7be8b7b7f48bb6 0 Scott - Miniature Pinscher Scott is a spirited... 380eaa884 11 2
14107 2 TT (TsingTao) 3 299 0 2 1 2 0 2 ... 1 1 100 41401 f7a1f357462dd40288bc0a8b048fdba6 0 Found this little sweetheart while on a holida... 237551493 4 2
5513 1 Boy 60 218 307 1 5 6 0 2 ... 1 1 0 41326 01deab6501d5c75bb614f38f574a3c24 0 I found him wandering in my condo area (USJ). ... 067f49481 5 4

2998 rows × 24 columns

In [16]:
test_df=test_df.fillna({'Name':'Undef','Description':'Undef'})

Summary Statistics

In [17]:
#since both of the datasets are following the same format the next summary statistics will be under the treining test only
df_train.describe()
Out[17]:
Type Age Breed1 Breed2 Gender Color1 Color2 Color3 MaturitySize FurLength Vaccinated Dewormed Sterilized Health Quantity Fee State VideoAmt PhotoAmt AdoptionSpeed
count 14993.000000 14993.000000 14993.000000 14993.000000 14993.000000 14993.000000 14993.000000 14993.000000 14993.000000 14993.000000 14993.000000 14993.000000 14993.000000 14993.000000 14993.000000 14993.000000 14993.000000 14993.000000 14993.000000 14993.000000
mean 1.457614 10.452078 265.272594 74.009738 1.776162 2.234176 3.222837 1.882012 1.862002 1.467485 1.731208 1.558727 1.914227 1.036617 1.576069 21.259988 41346.028347 0.056760 3.889215 2.516441
std 0.498217 18.155790 60.056818 123.011575 0.681592 1.745225 2.742562 2.984086 0.547959 0.599070 0.667649 0.695817 0.566172 0.199535 1.472477 78.414548 32.444153 0.346185 3.487810 1.177265
min 1.000000 0.000000 0.000000 0.000000 1.000000 1.000000 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 0.000000 41324.000000 0.000000 0.000000 0.000000
25% 1.000000 2.000000 265.000000 0.000000 1.000000 1.000000 0.000000 0.000000 2.000000 1.000000 1.000000 1.000000 2.000000 1.000000 1.000000 0.000000 41326.000000 0.000000 2.000000 2.000000
50% 1.000000 3.000000 266.000000 0.000000 2.000000 2.000000 2.000000 0.000000 2.000000 1.000000 2.000000 1.000000 2.000000 1.000000 1.000000 0.000000 41326.000000 0.000000 3.000000 2.000000
75% 2.000000 12.000000 307.000000 179.000000 2.000000 3.000000 6.000000 5.000000 2.000000 2.000000 2.000000 2.000000 2.000000 1.000000 1.000000 0.000000 41401.000000 0.000000 5.000000 4.000000
max 2.000000 255.000000 307.000000 307.000000 3.000000 7.000000 7.000000 7.000000 4.000000 3.000000 3.000000 3.000000 3.000000 3.000000 20.000000 3000.000000 41415.000000 8.000000 30.000000 4.000000
In [18]:
features = df_train.columns.values[:]
number_of_features = len(features)
print("Number of features: ", number_of_features, features)
Number of features:  24 ['Type' 'Name' 'Age' 'Breed1' 'Breed2' 'Gender' 'Color1' 'Color2' 'Color3'
 'MaturitySize' 'FurLength' 'Vaccinated' 'Dewormed' 'Sterilized' 'Health'
 'Quantity' 'Fee' 'State' 'RescuerID' 'VideoAmt' 'Description' 'PetID'
 'PhotoAmt' 'AdoptionSpeed']

Analyse the most common color of animals

Even though Black and Brown are the most common colors for pets in the dataset There are more than 2000 black colored pets who are unadopted.

Let's also analyse AdoptionSpeed across various combinations of animal color. (ColorName1,ColorName2 and ColorName3)

In [19]:
color_labels = pd.read_csv("data/color_labels.csv")
In [20]:
new_dataFrame = pd.read_csv("data/train.csv")
In [21]:
new_dataFrame.columns
Out[21]:
Index(['Type', 'Name', 'Age', 'Breed1', 'Breed2', 'Gender', 'Color1', 'Color2',
       'Color3', 'MaturitySize', 'FurLength', 'Vaccinated', 'Dewormed',
       'Sterilized', 'Health', 'Quantity', 'Fee', 'State', 'RescuerID',
       'VideoAmt', 'Description', 'PetID', 'PhotoAmt', 'AdoptionSpeed'],
      dtype='object')
In [22]:
color_labels.columns
Out[22]:
Index(['ColorID', 'ColorName'], dtype='object')
In [23]:
color_labels
Out[23]:
ColorID ColorName
0 1 Black
1 2 Brown
2 3 Golden
3 4 Yellow
4 5 Cream
5 6 Gray
6 7 White
In [24]:
new_dataFrame
Out[24]:
Type Name Age Breed1 Breed2 Gender Color1 Color2 Color3 MaturitySize ... Health Quantity Fee State RescuerID VideoAmt Description PetID PhotoAmt AdoptionSpeed
0 2 Nibble 3 299 0 1 1 7 0 1 ... 1 1 100 41326 8480853f516546f6cf33aa88cd76c379 0 Nibble is a 3+ month old ball of cuteness. He ... 86e1089a3 1 2
1 2 No Name Yet 1 265 0 1 1 2 0 2 ... 1 1 0 41401 3082c7125d8fb66f7dd4bff4192c8b14 0 I just found it alone yesterday near my apartm... 6296e909a 2 0
2 1 Brisco 1 307 0 1 2 7 0 2 ... 1 1 0 41326 fa90fa5b1ee11c86938398b60abc32cb 0 Their pregnant mother was dumped by her irresp... 3422e4906 7 3
3 1 Miko 4 307 0 2 1 2 0 2 ... 1 1 150 41401 9238e4f44c71a75282e62f7136c6b240 0 Good guard dog, very alert, active, obedience ... 5842f1ff5 8 2
4 1 Hunter 1 307 0 1 1 0 0 2 ... 1 1 0 41326 95481e953f8aed9ec3d16fc4509537e8 0 This handsome yet cute boy is up for adoption.... 850a43f90 3 2
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
14988 2 NaN 2 266 0 3 1 0 0 2 ... 1 4 0 41326 61c84bd7bcb6fb31d2d480b1bcf9682e 0 I have 4 kittens that need to be adopt urgentl... dc0935a84 3 2
14989 2 Serato & Eddie 60 265 264 3 1 4 7 2 ... 1 2 0 41326 1d5096c4a5e159a3b750c5cfcf6ceabf 0 Serato(female cat- 3 color) is 4 years old and... a01ab5b30 3 4
14990 2 Monkies 2 265 266 3 5 6 7 3 ... 1 5 30 41326 6f40a7acfad5cc0bb3e44591ea446c05 0 Mix breed, good temperament kittens. Love huma... d981b6395 5 3
14991 2 Ms Daym 9 266 0 2 4 7 0 1 ... 1 1 0 41336 c311c0c569245baa147d91fa4e351ae4 0 she is very shy..adventures and independent..s... e4da1c9e4 3 4
14992 1 Fili 1 307 307 1 2 0 0 2 ... 1 1 0 41332 9ed1d5493d223eaa5024c1a031dbc9c2 0 Fili just loves laying around and also loves b... a83d95ead 1 3

14993 rows × 24 columns

In [25]:
color_labels.rename(columns={'ColorID':'Color1','ColorName':'ColorName1'},inplace = True)
new_dataFrame=new_dataFrame.merge(color_labels,on='Color1',how='left')
new_dataFrame.columns
Out[25]:
Index(['Type', 'Name', 'Age', 'Breed1', 'Breed2', 'Gender', 'Color1', 'Color2',
       'Color3', 'MaturitySize', 'FurLength', 'Vaccinated', 'Dewormed',
       'Sterilized', 'Health', 'Quantity', 'Fee', 'State', 'RescuerID',
       'VideoAmt', 'Description', 'PetID', 'PhotoAmt', 'AdoptionSpeed',
       'ColorName1'],
      dtype='object')
In [26]:
color_labels.rename(columns={'Color1':'Color2','ColorName1':'ColorName2'},inplace = True)
new_dataFrame=new_dataFrame.merge(color_labels,on='Color2',how='left')
new_dataFrame.columns
Out[26]:
Index(['Type', 'Name', 'Age', 'Breed1', 'Breed2', 'Gender', 'Color1', 'Color2',
       'Color3', 'MaturitySize', 'FurLength', 'Vaccinated', 'Dewormed',
       'Sterilized', 'Health', 'Quantity', 'Fee', 'State', 'RescuerID',
       'VideoAmt', 'Description', 'PetID', 'PhotoAmt', 'AdoptionSpeed',
       'ColorName1', 'ColorName2'],
      dtype='object')
In [27]:
color_labels.rename(columns={'Color2':'Color3','ColorName2':'ColorName3'},inplace = True)
new_dataFrame=new_dataFrame.merge(color_labels,on='Color3',how='left')
new_dataFrame = new_dataFrame.drop(["Color1",'Color2','Color3'],axis = 1)
new_dataFrame.columns
Out[27]:
Index(['Type', 'Name', 'Age', 'Breed1', 'Breed2', 'Gender', 'MaturitySize',
       'FurLength', 'Vaccinated', 'Dewormed', 'Sterilized', 'Health',
       'Quantity', 'Fee', 'State', 'RescuerID', 'VideoAmt', 'Description',
       'PetID', 'PhotoAmt', 'AdoptionSpeed', 'ColorName1', 'ColorName2',
       'ColorName3'],
      dtype='object')
In [28]:
new_dataFrame["All_Colors"] = new_dataFrame["ColorName1"] + new_dataFrame["ColorName2"] + new_dataFrame["ColorName3"]
new_dataFrame["All_Colors"].unique()
Out[28]:
array([nan, 'BlackBrownWhite', 'BlackBrownCream', 'BrownCreamGray',
       'BlackGrayWhite', 'BrownCreamWhite', 'GoldenGrayWhite',
       'YellowGrayWhite', 'BlackGoldenWhite', 'BlackGoldenGray',
       'BrownGoldenWhite', 'BlackYellowWhite', 'BrownGoldenYellow',
       'BlackBrownGolden', 'BlackYellowCream', 'YellowCreamWhite',
       'BrownYellowCream', 'GoldenCreamWhite', 'BlackBrownGray',
       'BrownYellowWhite', 'BrownGoldenCream', 'BlackCreamWhite',
       'BlackBrownYellow', 'BrownGrayWhite', 'GoldenYellowWhite',
       'BlackYellowGray', 'CreamGrayWhite', 'BrownGoldenGray',
       'BrownYellowGray', 'GoldenYellowCream', 'YellowCreamGray',
       'BlackGoldenCream', 'GoldenCreamGray', 'BlackCreamGray',
       'BlackGoldenYellow', 'GoldenYellowGray'], dtype=object)
In [29]:
#Top 5 color
top_5_colors = new_dataFrame["All_Colors"].value_counts()[:5]
top_5_color = dict(top_5_colors)
top_5_color
Out[29]:
{'BlackBrownWhite': 1159,
 'BlackGrayWhite': 449,
 'BlackYellowWhite': 353,
 'BrownCreamWhite': 274,
 'BlackBrownCream': 255}

Now let's analyse the effect of Fee on Adoption Speed

In [30]:
sum_fee = df_train["Fee"].value_counts()[:10]
sum_fee
Out[30]:
0      12663
50       468
100      408
200      219
150      162
20       136
300      120
30       103
250       92
1         82
Name: Fee, dtype: int64
In [31]:
print(sum_fee.sum())
14453

As we can see, more than 14,000 pets in the data are bought in under 300 Singapore Dollars. Most of the Animals (12663) in the listing who are adopted are free of cost. Let's analyse the Adoption Speed Based on the Fee charged and Type for the Animal

In [32]:
plt.figure(figsize=(12,8))
sns.scatterplot(x='AdoptionSpeed',y='Fee',data=df_train,palette="summer",hue='Type')
Out[32]:
<matplotlib.axes._subplots.AxesSubplot at 0x2b19e8e8a90>

From the plot, it is pretty clear that most pets within fee range below 1000 dollars are adopted more. There is only 1 dog costing 3000 $ who also is adopted within 2-3 months. This basically means that generally people adopt pets who are mid-range expensive

Visualizations

Plotting the Adoption Speed column

In [33]:
#plotting target column
f,a=plt.subplots(figsize=(8,6))
plt.title('Before log(1+x)')
sns.distplot(df_train['AdoptionSpeed'],fit=stats.norm)
plt.show()




obj = go.Box(y=df_train["AdoptionSpeed"], name="AdoptionSpeed", boxmean='sd', boxpoints = 'all')
fig = go.Figure([obj])
plotly.offline.iplot(fig, filename="Adoption Speed Box Plot")

Correlation matrix

With this matrix we want to see how much correlated are the independent columns with the dependent AdoptionSpeed. As a metric we are choosing the Spearman Correlation Metric since it checks monotony, which we can see if there is any relation beside the linear one and it works well with ordinal and continous variables together.

In [33]:
corrmat = df_train.corr(method='spearman')
layout = go.Layout(width=1000,
                   height=1000,
                   autosize=False)
obj = go.Heatmap(z=corrmat.values, x=corrmat.columns.values, y=corrmat.columns.values)
fig = go.Figure([obj], layout)
plotly.offline.iplot(fig, filename="Correlation matrix")

Plotting correlated features

For our plot, we will get all the features that have absolute value above 0.5 with the dependent AdoptionSpeed.

In [34]:
obj = go.Splom(dimensions=[dict(label='Type', values = df_train['Type']),
                           dict(label='Age', values=df_train['Age']),
                           dict(label='Breed1', values=df_train['Breed1']),
                           dict(label='Breed2', values=df_train['Breed2']),
                           dict(label='Gender', values=df_train['Gender']),
                           dict(label='Color1', values=df_train['Color1']),
                           dict(label='Color2', values=df_train['Color2']),
                           dict(label='Color3', values=df_train['Color3']),
                           dict(label='MaturitySize', values=df_train['MaturitySize']),
                           dict(label='FurLength', values=df_train['FurLength']),
                           dict(label='Vaccinated', values=df_train['Vaccinated']),
                           dict(label='Dewormed', values=df_train['Dewormed']),
                           dict(label='Sterilized', values=df_train['Sterilized']),
                           dict(label='Health', values=df_train['Health']),
                           dict(label='Quanitity', values=df_train['Quantity']),
                           dict(label='Fee', values=df_train['Fee']),
                           dict(label='State', values=df_train['State']),
                           dict(label='VideoAmt', values=df_train['VideoAmt']),
                          dict(label='PhotoAmt', values=df_train['PhotoAmt']),
                          dict(label='AdoptionSpeed', values=df_train['AdoptionSpeed'])],
                        marker=dict(size=5,
                           line=dict(width=0.5,
                                     color='rgb(230,230,230)')),
                           diagonal=dict(visible=False),
                           showupperhalf=False)
axisd = dict(showline=False,
           zeroline=False,
           gridcolor='#fff')

layout = go.Layout(title="Pairplot for adoption speed",
                   dragmode='select',
                   width=2000,
                   height=2000,
                   autosize=False,
                   hovermode='closest',
                   plot_bgcolor='rgba(240,240,240, 0.95)',
                   xaxis1=dict(axisd), xaxis2=dict(axisd), xaxis3=dict(axisd), xaxis4=dict(axisd), xaxis5=dict(axisd), xaxis6=dict(axisd),
                   xaxis7=dict(axisd), xaxis8=dict(axisd), xaxis9=dict(axisd), xaxis10=dict(axisd), xaxis11=dict(axisd), xaxis12=dict(axisd),
                   xaxis13=dict(axisd), xaxis14=dict(axisd), xaxis15=dict(axisd), xaxis16=dict(axisd), xaxis17=dict(axisd), xaxis18=dict(axisd),
                   xaxis19=dict(axisd),xaxis20=dict(axisd),
                   yaxis1=dict(axisd), yaxis2=dict(axisd), yaxis3=dict(axisd), yaxis4=dict(axisd), yaxis5=dict(axisd), yaxis6=dict(axisd),
                   yaxis7=dict(axisd), yaxis8=dict(axisd), yaxis9=dict(axisd), yaxis10=dict(axisd), yaxis11=dict(axisd), yaxis12=dict(axisd),
                   yaxis13=dict(axisd), yaxis14=dict(axisd), yaxis15=dict(axisd), yaxis16=dict(axisd), yaxis17=dict(axisd), yaxis18=dict(axisd),
                  yaxis19=dict(axisd),yaxis20=dict(axisd))


fig = go.Figure(data=[obj], layout=layout)
plotly.offline.iplot(fig, filename="pairplot")

Scaling the values

In [126]:
Data_train  = df_train.drop(columns = ['Name','RescuerID','Description','PetID','State'])
print(Data_train)
       Type  Age  Breed1  Breed2  Gender  Color1  Color2  Color3  \
0         2    3     299       0       1       1       7       0   
1         2    1     265       0       1       1       2       0   
2         1    1     307       0       1       2       7       0   
3         1    4     307       0       2       1       2       0   
4         1    1     307       0       1       1       0       0   
...     ...  ...     ...     ...     ...     ...     ...     ...   
14988     2    2     266       0       3       1       0       0   
14989     2   60     265     264       3       1       4       7   
14990     2    2     265     266       3       5       6       7   
14991     2    9     266       0       2       4       7       0   
14992     1    1     307     307       1       2       0       0   

       MaturitySize  FurLength  ...  Dewormed  Sterilized  Health  Quantity  \
0                 1          1  ...         2           2       1         1   
1                 2          2  ...         3           3       1         1   
2                 2          2  ...         1           2       1         1   
3                 2          1  ...         1           2       1         1   
4                 2          1  ...         2           2       1         1   
...             ...        ...  ...       ...         ...     ...       ...   
14988             2          2  ...         2           2       1         4   
14989             2          2  ...         1           1       1         2   
14990             3          2  ...         1           3       1         5   
14991             1          1  ...         1           1       1         1   
14992             2          1  ...         2           2       1         1   

       Fee  VideoAmt  PhotoAmt  AdoptionSpeed  SentMagnitude  SentScore  
0      100         0         1              0              0          0  
1        0         0         2              0              0          0  
2        0         0         7              1              0          0  
3      150         0         8              0              0          0  
4        0         0         3              0              0          0  
...    ...       ...       ...            ...            ...        ...  
14988    0         0         3              0              0          0  
14989    0         0         3              1              0          0  
14990   30         0         5              1              0          0  
14991    0         0         3              1              0          0  
14992    0         0         1              1              0          0  

[14993 rows x 21 columns]
In [127]:
scaler = StandardScaler()
Data_train = scaler.fit_transform(Data_train)
print("Data_train scaled =\n", Data_train)
Data_train scaled =
 [[ 1.08869186 -0.41046553  0.56161035 ... -0.99461199  0.
   0.        ]
 [ 1.08869186 -0.5206269  -0.00453908 ... -0.99461199  0.
   0.        ]
 [-0.91853355 -0.5206269   0.69482199 ...  1.00541719  0.
   0.        ]
 ...
 [ 1.08869186 -0.46554622 -0.00453908 ...  1.00541719  0.
   0.        ]
 [ 1.08869186 -0.07998143  0.01211237 ...  1.00541719  0.
   0.        ]
 [-0.91853355 -0.5206269   0.69482199 ...  1.00541719  0.
   0.        ]]

Principal Component Analysis

PCA for short, is a method for reducing the dimensionality of data. It can be thought of as a projection method where data with m-columns (features) is projected into a subspace with m or fewer columns, whilst retaining the essence of the original data. ) It is used to explain the variance-covariance structure of a set of variables through linear combinations. It is often used as a dimensionality-reduction technique.

Scatter Plot

In [128]:
pcaT = PCA(n_components=2)
principal_components_train = pcaT.fit_transform(Data_train)
print("principal_components (TRAIN) =\n", principal_components_train)



plt.figure(figsize=(20, 20))
plt.scatter(principal_components_train[:, 0], principal_components_train[:, 1])
plt.show()
principal_components (TRAIN) =
 [[ 0.67316306 -0.73953704]
 [ 2.27737277 -2.77557275]
 [-1.21673034  0.4214265 ]
 ...
 [ 1.92480964  1.58373889]
 [-1.17050019  1.18750555]
 [-0.05687433 -1.20809294]]

Choosing the right technique for prediction

Since our data in the column for Adoption Speed are numbers that are in an ascending manner and acording to the explanation for the meaning of each value it is intuitively that we would like to try regression techniques, but also we would like to make sure that regression is the right choice and that is why we would apply log transformation function and plot it. To prevent overfitting, maybe better choice would be to try some classification methods that are more suitable for this dataset. Because the classes are not binary, but multiclass, and most of the features by which we would like to classify are already labeled we would use some classification algorithms for multiclass-value attributes because this dataset is suitable for them.

As we can see from the pair plot, there are few points that don't fit with the crowd. There are especially visible on Fee and AdoptionSpeed cel-plot. It is a little unusual to adopt a pet with so high fees that has been given for adoption for a longer time, but we won't treat this values as outlier so we are not going to remove them. In this step, we will apply log transformation on AdoptionSpeed.

In [38]:
#log it
train_df["AdoptionSpeed"] = np.log1p(train_df["AdoptionSpeed"])
test_df["AdoptionSpeed"] = np.log1p(test_df["AdoptionSpeed"])
#plot again
f,a=plt.subplots(figsize=(8,6))
plt.title('After log(1+x)')
sns.distplot(train_df["AdoptionSpeed"],fit=stats.norm)
f,a=plt.subplots(figsize=(8,6))
stats.probplot(train_df["AdoptionSpeed"],plot=plt)
plt.show()

After theese plots, we decided that classification techniques may provide higher accuracy and prevent overfitting, because we are training a dataset with 14900 rows.

Classification

To start classifying, we firstly need to divide the datset into separate training and testing subsets, where the target feature in this case AdoptionSpeed, is set in the y variable, and the other relevant features that will play deciding role in the classification are the Type, Breed, Age, Colors, Quanity, MaturityAge, the pet being Dewormed, Sterilized and Vaccined, the length of the Furr, number of photos or videos uploaded, Health, which are in the X.

In [36]:
#This code was just for adjustment of the class Type, in case we make another clssification with different target

# Everything except Type variable
#X = df_train.drop("AdoptionSpeed", axis=1)
# Everything except Name, RescueID,Description,PetID,State variables
X=df_train.drop(columns = ['Name','RescuerID','Description','PetID','State','AdoptionSpeed'])

# Target variable
#y1 = df_train["Type"]
# Independent variables (no target column)
#print(X1.head())
X.tail()
Out[36]:
Type Age Breed1 Breed2 Gender Color1 Color2 Color3 MaturitySize FurLength Vaccinated Dewormed Sterilized Health Quantity Fee VideoAmt PhotoAmt
14988 2 2 266 0 3 1 0 0 2 2 2 2 2 1 4 0 0 3
14989 2 60 265 264 3 1 4 7 2 2 1 1 1 1 2 0 0 3
14990 2 2 265 266 3 5 6 7 3 2 2 1 3 1 5 30 0 5
14991 2 9 266 0 2 4 7 0 1 1 1 1 1 1 1 0 0 3
14992 1 1 307 307 1 2 0 0 2 1 2 2 2 1 1 0 0 1
In [37]:
# Random seed for reproducibility
np.random.seed(42)

# Split into train & test set
X_train, X_test, y_train, y_test = train_test_split(X, # independent variables 
                                                    y, # dependent variable
                                                    test_size = 0.2) # percentage of data to use for test set
In [38]:
X_train.head()
Out[38]:
Type Age Breed1 Breed2 Gender Color1 Color2 Color3 MaturitySize FurLength Vaccinated Dewormed Sterilized Health Quantity Fee VideoAmt PhotoAmt
6786 1 1 307 0 3 2 7 0 2 2 2 2 2 1 7 0 0 5
9837 1 2 307 307 2 2 7 0 2 2 2 1 2 1 1 0 0 3
7688 2 1 266 266 3 1 4 7 2 1 2 2 3 1 3 0 0 2
6556 1 2 103 307 2 1 2 0 2 2 2 2 2 1 1 0 0 1
11322 1 2 307 0 3 2 4 7 1 1 3 3 3 1 3 0 1 3

From here we can see that our Trainign data set is randomly choosen.

In [39]:
y_train, len(y_train)
Out[39]:
(6786     3
 9837     3
 7688     3
 6556     2
 11322    4
         ..
 5191     1
 13418    2
 5390     3
 860      3
 7270     3
 Name: AdoptionSpeed, Length: 11994, dtype: int64,
 11994)

Model choices

Now we've got our data prepared, we can start to fit models. We'll be using the following and comparing their results.

1.Logistic Regression - LogisticRegression()

2.K-Nearest Neighbors - KNeighboursClassifier()

3.RandomForest - RandomForestClassifier()

All of the algorithms in the Scikit-Learn library use the same functions, for training a model, model.fit(X_train, y_train) and for scoring a model model.score(X_test, y_test). score() returns the ratio of correct predictions (1.0 = 100% correct).

In [40]:
models = {"KNN": KNeighborsClassifier(),
          "Logistic Regression": LogisticRegression(), 
          "Random Forest": RandomForestClassifier()}

# Create function to fit and score models
def fit_and_score(models, X_train, X_test, y_train, y_test):
    """
    Fits and evaluates given machine learning models.
    models : a dict of different Scikit-Learn machine learning models
    X_train : training data
    X_test : testing data
    y_train : labels assosciated with training data
    y_test : labels assosciated with test data
    """
    # Random seed for reproducible results
    np.random.seed(42)
    # Make a list to keep model scores
    model_scores = {}
    # Loop through models
    for name, model in models.items():
        # Fit the model to the data
        model.fit(X_train, y_train)
        # Evaluate the model and append its score to model_scores
        model_scores[name] = model.score(X_test, y_test)
    return model_scores
In [41]:
import warnings; warnings.simplefilter('ignore')
model_scores = fit_and_score(models=models,
                             X_train=X_train,
                             X_test=X_test,
                             y_train=y_train,
                             y_test=y_test)
model_scores
Out[41]:
{'KNN': 0.33211070356785594,
 'Logistic Regression': 0.33411137045681893,
 'Random Forest': 0.3967989329776592}

Model Comparison

In [42]:
model_compare = pd.DataFrame(model_scores, index=['accuracy'])
model_compare.T.plot.bar();

Tune KNeighborsClassifier (K-Nearest Neighbors or KNN) by hand

There's one main hyperparameter we can tune for the K-Nearest Neighbors (KNN) algorithm, and that is number of neighbours. The default is 5 (n_neigbors=5). A small value of k means that noise will have a higher influence on the result and a large value make it computationally expensive, so we will leave the default.

In [43]:
# Create a list of train scores
train_scores = []

# Create a list of test scores
test_scores = []

# Create a list of different values for n_neighbors
neighbors = range(1, 21) # 1 to 20

# Setup algorithm
knn = KNeighborsClassifier()

# Loop through different neighbors values
for i in neighbors:
    knn.set_params(n_neighbors = i) # set neighbors value
    
    # Fit the algorithm
    knn.fit(X_train, y_train)
    
    # Update the training scores
    train_scores.append(knn.score(X_train, y_train))
    
    # Update the test scores
    test_scores.append(knn.score(X_test, y_test))
In [44]:
#KNN train scores
train_scores
Out[44]:
[0.9728197432049358,
 0.6449891612472903,
 0.5982991495747874,
 0.5623645155911289,
 0.5356011339002835,
 0.516008004002001,
 0.5022511255627814,
 0.4894113723528431,
 0.48432549608137404,
 0.4765716191429048,
 0.4694013673503418,
 0.4614807403701851,
 0.4558946139736535,
 0.4521427380356845,
 0.4491412372853093,
 0.44513923628480906,
 0.44022011005502754,
 0.4418876104719026,
 0.44130398532599635,
 0.439719859929965]
In [45]:
#Due to better understanding we are going to plot them
plt.plot(neighbors, train_scores, label="Train score")
plt.plot(neighbors, test_scores, label="Test score")
plt.xticks(np.arange(1, 21, 1))
plt.xlabel("Number of neighbors")
plt.ylabel("Model score")
plt.legend()

print(f"Maximum KNN score on the test data: {max(test_scores)*100:.2f}%")
Maximum KNN score on the test data: 36.31%

We've tuned KNN by hand but let's see how we can LogisticsRegression and RandomForestClassifier using RandomizedSearchCV.

Instead of us having to manually try different hyperparameters by hand, RandomizedSearchCV tries a number of different combinations, evaluates them and saves the best.

Tuning models with with RandomizedSearchCV

Reading the Scikit-Learn documentation for LogisticRegression, we find there's a number of different hyperparameters we can tune. The same for RandomForestClassifier.

Let's create a hyperparameter grid (a dictionary of different hyperparameters) for each and then test them out. We will use RandomizedSearchCV to try and tune our LogisticRegression model.

We'll pass it the different hyperparameters from log_reg_grid as well as set n_iter = 20. This means, RandomizedSearchCV will try 20 different combinations of hyperparameters from log_reg_grid and save the best ones.

In [46]:
log_reg_grid = {"C": np.logspace(-4, 4, 20),
                "solver": ["liblinear"]}

# Setup grid hyperparameter search for LogisticRegression
gs_log_reg = GridSearchCV(LogisticRegression(),
                          param_grid=log_reg_grid,
                          cv=5,
                          verbose=True)

# Fit grid hyperparameter search model
gs_log_reg.fit(X_train, y_train);
Fitting 5 folds for each of 20 candidates, totalling 100 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 100 out of 100 | elapsed:  1.8min finished
In [47]:
# Check the best parameters
gs_log_reg.best_params_
Out[47]:
{'C': 3792.690190732246, 'solver': 'liblinear'}
In [48]:
# Evaluate the model
gs_log_reg.score(X_test, y_test)
Out[48]:
0.3611203734578193

Our grid only has a maximum of 20 different hyperparameter combinations.

Note: If there are a large amount of hyperparameters combinations in your grid, GridSearchCV may take a long time to try them all out. This is why it's a good idea to start with RandomizedSearchCV, try a certain amount of combinations and then use GridSearchCV to refine them.

Evaluating a classification model, beyond accuracy

Now we've got a tuned model, let's get some of the metrics we discussed before.

We want:

ROC curve and AUC score - plot_roc_curve()

Confusion matrix - confusion_matrix()

Classification report - classification_report()

Precision - precision_score()

Recall - recall_score()

F1-score - f1_score()

Luckily, Scikit-Learn has these all built-in.

To access them, we'll have to use our model to make predictions on the test set. You can make predictions by calling predict() on a trained model and passing it the data you'd like to predict on.

We'll make predictions on the test data.

In [49]:
# Make preidctions on test data
y_preds = gs_log_reg.predict(X_test)
In [50]:
y_preds
Out[50]:
array([4, 2, 4, ..., 4, 4, 2], dtype=int64)
In [54]:
y_test
Out[54]:
13408    4
6472     2
9967     4
862      2
5967     2
        ..
8191     2
3297     2
14107    2
5513     4
9938     3
Name: AdoptionSpeed, Length: 2999, dtype: int64

Confusion matrix

A confusion matrix is a visual way to show where your model made the right predictions and where it made the wrong predictions (or in other words, got confused).

Scikit-Learn allows us to create a confusion matrix using confusion_matrix() and passing it the true labels and predicted labels.

In [51]:
# Display confusion matrix
print(confusion_matrix(y_test, y_preds))
[[  0  28  23   2  39]
 [  0 132 248  38 209]
 [  0 104 285  77 340]
 [  0  69 184  96 292]
 [  0  75 134  54 570]]
In [52]:
# Import Seaborn
import seaborn as sns
sns.set(font_scale=1.5) # Increase font size

def plot_conf_mat(y_test, y_preds):
    """
    Plots a confusion matrix using Seaborn's heatmap().
    """
    fig, ax = plt.subplots(figsize=(3, 3))
    ax = sns.heatmap(confusion_matrix(y_test, y_preds),
                     annot=True, # Annotate the boxes
                     cbar=False)
    plt.xlabel("true label")
    plt.ylabel("predicted label")
    
plot_conf_mat(y_test, y_preds)

Classification report

We can make a classification report using classification_report() and passing it the true labels as well as our models predicted labels.

A classification report will also give us information of the precision and recall of our model for each class.

In [53]:
import warnings; warnings.simplefilter('ignore')
# Show classification report
print(classification_report(y_test, y_preds))
              precision    recall  f1-score   support

           0       0.00      0.00      0.00        92
           1       0.32      0.21      0.26       627
           2       0.33      0.35      0.34       806
           3       0.36      0.15      0.21       641
           4       0.39      0.68      0.50       833

    accuracy                           0.36      2999
   macro avg       0.28      0.28      0.26      2999
weighted avg       0.34      0.36      0.33      2999

In [54]:
# Check best hyperparameters
gs_log_reg.best_params_
Out[54]:
{'C': 3792.690190732246, 'solver': 'liblinear'}
In [55]:
# Import cross_val_score
from sklearn.model_selection import cross_val_score

# Instantiate best model with best hyperparameters (found with GridSearchCV)
clf = LogisticRegression(C=0.23357214690901212,
                         solver="liblinear")
In [56]:
# Cross-validated accuracy score
cv_acc = cross_val_score(clf,
                         X,
                         y,
                         cv=5, # 5-fold cross-validation
                         scoring="accuracy") # accuracy as scoring
cv_acc
Out[56]:
array([0.35245082, 0.34844948, 0.36245415, 0.34589726, 0.33789193])
In [57]:
# Since there are 5 metrics here, we'll take the average.
cv_acc = np.mean(cv_acc)
cv_acc
Out[57]:
0.34942872885580495
In [58]:
# Fit an instance of LogisticRegression (taken from above)
clf.fit(X_train, y_train);
In [59]:
# Check coef_
clf.coef_
Out[59]:
array([[ 3.31851977e-01, -7.38818916e-03, -5.22651823e-03,
         9.91557711e-04, -1.71439529e-01,  3.15769179e-02,
         1.80186939e-02,  4.46596451e-02, -3.83233085e-01,
         3.45774045e-01, -1.30077845e-01,  1.63233287e-01,
        -2.32090760e-02, -3.67698491e-01, -1.05628839e-01,
        -9.73928740e-04,  1.79788866e-01, -6.45116468e-02],
       [ 3.21322412e-01, -1.47651431e-02, -4.86082779e-03,
        -6.10563757e-04, -1.34814518e-01,  3.31876362e-02,
        -9.98232469e-04,  5.14478979e-03, -1.62288432e-01,
         3.05862237e-01,  8.56043301e-02, -5.27728748e-02,
         2.28739079e-01, -4.02418509e-01, -4.54664064e-02,
        -4.03885613e-04, -1.85399909e-01, -1.44887515e-02],
       [-3.57399406e-02, -8.57981044e-03, -7.15172088e-04,
        -2.62953734e-04, -1.03015083e-01,  7.49317429e-03,
         3.36010036e-03,  3.69771740e-03, -8.36274017e-03,
         4.55271556e-02,  1.49678537e-01, -1.40436075e-01,
         1.13304557e-02, -2.19046765e-01, -1.05336809e-02,
         5.22451509e-05,  5.64289754e-02,  1.47146203e-02],
       [-3.16896004e-01, -3.80587392e-03, -1.35714715e-03,
         2.31808425e-04,  1.23535815e-01, -1.43481665e-02,
         1.03154334e-02, -1.43190327e-02,  4.44350023e-02,
        -1.08553734e-01,  3.57587392e-02, -1.05678841e-01,
        -1.87036740e-01, -5.89103715e-02, -6.84082177e-02,
        -1.00011696e-03, -6.19299313e-02,  6.92487002e-02],
       [-6.00814297e-02,  1.85099661e-02,  6.00038552e-03,
         1.80823370e-04,  7.13695315e-02, -3.63098413e-02,
        -1.73822703e-02,  4.15695384e-03,  3.32970361e-02,
        -3.00812817e-01, -2.38923910e-01,  2.23504245e-01,
        -1.24536995e-01,  2.21163621e-01,  1.04300536e-01,
         1.09622574e-03,  7.51006743e-02, -8.72503713e-02]])
In [60]:
features_dict = dict(zip(df_train.columns, list(clf.coef_[0])))
features_dict
Out[60]:
{'Type': 0.3318519765792061,
 'Name': -0.0073881891626917705,
 'Age': -0.0052265182308968644,
 'Breed1': 0.0009915577106695392,
 'Breed2': -0.1714395291762436,
 'Gender': 0.031576917949486555,
 'Color1': 0.0180186939117311,
 'Color2': 0.044659645144588714,
 'Color3': -0.3832330853836443,
 'MaturitySize': 0.3457740454812721,
 'FurLength': -0.1300778447521587,
 'Vaccinated': 0.16323328711173418,
 'Dewormed': -0.02320907595136705,
 'Sterilized': -0.36769849149485484,
 'Health': -0.10562883945577475,
 'Quantity': -0.0009739287395131739,
 'Fee': 0.17978886596123064,
 'State': -0.06451164683523318}
In [61]:
#Now we've match the feature coefficients to different features, let's visualize them.
# Visualize feature importance
features_df = pd.DataFrame(features_dict, index=[0])
features_df.T.plot.bar(title="Feature Importance", legend=False);
In [62]:
pd.crosstab(df_train["AdoptionSpeed"],df_train["Gender"])
Out[62]:
Gender 1 2 3
AdoptionSpeed
0 160 204 46
1 1283 1366 441
2 1578 1911 548
3 1109 1671 479
4 1406 2125 666
In [63]:
pd.crosstab(df_train["AdoptionSpeed"],df_train["Type"])
Out[63]:
Type 1 2
AdoptionSpeed
0 170 240
1 1435 1655
2 2164 1873
3 1949 1310
4 2414 1783
In [64]:
pd.crosstab(df_train["AdoptionSpeed"],df_train["Dewormed"])
Out[64]:
Dewormed 1 2 3
AdoptionSpeed
0 205 146 59
1 1572 1188 330
2 2273 1347 417
3 1988 914 357
4 2359 1220 618
In [65]:
pd.crosstab(df_train["AdoptionSpeed"],df_train["Vaccinated"])
Out[65]:
Vaccinated 1 2 3
AdoptionSpeed
0 146 206 58
1 965 1777 348
2 1473 2112 452
3 1419 1459 381
4 1895 1673 629
In [66]:
df_train.groupby(['AdoptionSpeed']).mean()
Out[66]:
Type Age Breed1 Breed2 Gender Color1 Color2 Color3 MaturitySize FurLength Vaccinated Dewormed Sterilized Health Quantity Fee State VideoAmt PhotoAmt
AdoptionSpeed
0 1.585366 10.451220 251.097561 96.575610 1.721951 2.402439 3.509756 2.158537 1.775610 1.663415 1.785366 1.643902 2.000000 1.046341 1.414634 22.085366 41347.480488 0.060976 3.324390
1 1.535599 8.488350 255.885113 73.209061 1.727508 2.374434 3.339482 1.869903 1.821359 1.548544 1.800324 1.598058 1.994822 1.030097 1.464401 21.822330 41346.546602 0.044984 3.727184
2 1.463958 8.823631 265.928908 72.624474 1.744860 2.233589 3.251672 1.910329 1.862026 1.467179 1.747089 1.540253 1.926431 1.029230 1.551400 21.582611 41345.193213 0.063413 4.071836
3 1.401964 10.189936 262.409328 80.782755 1.806689 2.169684 3.248236 1.820497 1.885548 1.435410 1.681497 1.499540 1.867444 1.036821 1.534213 20.151580 41343.587297 0.072722 4.620743
4 1.424827 13.667858 275.160829 68.467953 1.823684 2.165118 3.061472 1.884441 1.882059 1.413867 1.698356 1.585180 1.871098 1.047415 1.730284 21.315702 41348.203717 0.046223 3.319990
In [67]:
%matplotlib inline
import matplotlib.pyplot as plt
In [68]:
df_train["AdoptionSpeed"].hist()
Out[68]:
<matplotlib.axes._subplots.AxesSubplot at 0x2b1a109e1f0>
In [74]:
#scatter plot
fig,ax=plt.subplots(figsize=(10,10))
ax.scatter(df_train["AdoptionSpeed"],df_train["PhotoAmt"])
Out[74]:
<matplotlib.collections.PathCollection at 0x27d44f49ca0>

Since we saw in our previous steps that RandomForestClassifier gave us the best score, we wanted to use another kernel baseline random forest, to demonstrate a routine, get_class_bounds(), that maps real-valued ordinal classes to integer classes based on the known class-distribution of the target y values.

Baseline Random Forest

This kernel uses Kseniia Palin's kernel baseline random forest to demonstrate a routine, get_class_bounds(), that maps real-valued ordinal classes to integer classes based on the known class-distribution of the target y values.

Comments have been added to Palin's code but otherwise the actual data preparation and model fitting are left as-is; the purpose of this kernel is to demonstrate get_class_bounds() rather than optimize the model. Note that Palin's implementation here generates Test predictions for each of the k-folds and averages those predictions; this is different from fitting the whole training set and using that model to generate a single Test prediction.

get_class_bounds() is similar to the OptimizedRounder used in other kernels, but it runs more quickly and it may be less prone to overfitting. It could also be used in OptimizedRounder to set the initial boundary values, e.g., in place of initial_coef = [0.5, 1.5, 2.5, 3.5].

In [69]:
df_train.head()
Out[69]:
Type Name Age Breed1 Breed2 Gender Color1 Color2 Color3 MaturitySize ... Health Quantity Fee State RescuerID VideoAmt Description PetID PhotoAmt AdoptionSpeed
0 2 Nibble 3 299 0 1 1 7 0 1 ... 1 1 100 41326 8480853f516546f6cf33aa88cd76c379 0 Nibble is a 3+ month old ball of cuteness. He ... 86e1089a3 1 2
1 2 No Name Yet 1 265 0 1 1 2 0 2 ... 1 1 0 41401 3082c7125d8fb66f7dd4bff4192c8b14 0 I just found it alone yesterday near my apartm... 6296e909a 2 0
2 1 Brisco 1 307 0 1 2 7 0 2 ... 1 1 0 41326 fa90fa5b1ee11c86938398b60abc32cb 0 Their pregnant mother was dumped by her irresp... 3422e4906 7 3
3 1 Miko 4 307 0 2 1 2 0 2 ... 1 1 150 41401 9238e4f44c71a75282e62f7136c6b240 0 Good guard dog, very alert, active, obedience ... 5842f1ff5 8 2
4 1 Hunter 1 307 0 1 1 0 0 2 ... 1 1 0 41326 95481e953f8aed9ec3d16fc4509537e8 0 This handsome yet cute boy is up for adoption.... 850a43f90 3 2

5 rows × 24 columns

In [70]:
test_df = pd.read_csv("data/test.csv")
In [71]:
# Define routines to read in the Training,Test sentiment score and magnitude;
# 0,0 is returned if there is no file found.
# Note that when used the argument fn will be one row of a dataframe
# in shich case fn['PetID'] is the PetId.

def readFile(fn):
    file = '../WBS-Project/data/train/'+fn['PetID']+'.json'
    if os.path.exists(file):
        with open(file) as data_file:    
            data = json.load(data_file)  

        df = json_normalize(data)
        mag = df['documentSentiment.magnitude'].values[0]
        score = df['documentSentiment.score'].values[0]
        return pd.Series([mag,score],index=['mag','score']) 
    else:
        return pd.Series([0,0],index=['mag','score'])
    
def readTestFile(fn):
    file = '../WBS-Project/data/test/'+fn['PetID']+'.json'
    if os.path.exists(file):
        with open(file) as data_file:    
            data = json.load(data_file)  

        df = json_normalize(data)
        mag = df['documentSentiment.magnitude'].values[0]
        score = df['documentSentiment.score'].values[0]
        return pd.Series([mag,score],index=['mag','score']) 
    else:
        return pd.Series([0,0],index=['mag','score'])
In [72]:
# Here the routines above are applied to each row of the dataframes.
# This is done using panadas' `apply()` with a small "anonymous function" defined with a python `lambda`.
# Note that just `train` could be used inplace of `train[['PetID']]`,
# this would make it clearer that x is a row of the dataframe and not the PetID value.

df_train[['SentMagnitude', 'SentScore']] = df_train[['PetID']].apply(lambda x: readFile(x), axis=1)
test_df[['SentMagnitude', 'SentScore']] = test_df[['PetID']].apply(lambda x: readTestFile(x), axis=1)

Do Machine Learning

In [73]:
# Setup the training X, y, and test X
train_X = df_train.drop(['Name', 'Description', 'RescuerID', 'PetID', 'AdoptionSpeed'], axis=1)
train_y = df_train['AdoptionSpeed']
test_X = test_df.drop(['Name', 'Description', 'RescuerID', 'PetID'], axis=1)
In [74]:
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import accuracy_score
from sklearn.metrics import cohen_kappa_score
from sklearn.ensemble import RandomForestClassifier
In [75]:
# Define what will be the final predicted train and test values
train_meta = np.zeros(train_y.shape)
test_meta = np.zeros(test_X.shape[0])

# Choose and initialize a model.
clf = RandomForestClassifier(bootstrap=True, criterion = 'gini', max_depth=80,
                             max_features='auto', min_samples_leaf=5,
                             min_samples_split=5, n_estimators=200)

# Divide the training data into k-folds, k=4 here.
splits = list(StratifiedKFold(n_splits=4, shuffle=True, random_state=1812).split(train_X, train_y))

# Loop over the folds and fit the model to the fold's training data.
# Then evaluate that model on i) the validation data of that fold, 
# and ii) on all of the test data.
for idx, (train_idx, valid_idx) in enumerate(splits):
        # The training and validation sets for this fold
        X_train = train_X.iloc[train_idx]
        y_train = train_y[train_idx]
        X_val = train_X.iloc[valid_idx]
        y_val = train_y[valid_idx]
        
        # Fit the model
        clf.fit(X_train, y_train)
         # Look at the validation kappa and accuracy with classes right from the model
        y_pred = clf.predict(X_val)
        print("Fold {}: accuracy = {:.1f}%, kappa = {:.4f}  (no boundary adjustment)".format(idx,
                                100.0*accuracy_score(y_val, y_pred),     
                                cohen_kappa_score(y_val, y_pred, weights='quadratic')))
        #
        # Assign real-valued classes in addition to the integer classes of y_pred.
        # Start with the predicted probabilities by class
        y_probs = clf.predict_proba(X_val)
        # and get the class values (use a copy incase we change values)
        class_vals = clf.classes_.copy()
        # Change the ordinal weight of class 0 to be -1 as suggested by the plot in discussion:
        # https://www.kaggle.com/c/petfinder-adoption-prediction/discussion/76265
        # Does mot make much difference, though.
        class_vals[0] = -1
        # Create the float class values as the probability-weighted class
        # Here a python "list comprehension" is used rather than a loop.
        y_floats = [sum(y_probs[ix]*class_vals) for ix in range(len(y_probs[:,0]))]
        #   
        # Save these y_float values instead of the y_pred integers;
        ##train_meta[valid_idx] = y_pred.reshape(-1)
        train_meta[valid_idx] = y_floats
        # the predictions for just this validation fold are saved in the train_meta array;
        # looping over all folds will provide one prediction for each training sample.
        # Now use this fold's same model to generate Test predictions.
        ##y_test = clf.predict(test_X)
        # Instead of integer classes, get the predicted probabilites
        test_probs = clf.predict_proba(test_X)
        # and turn these into float class values.
        # Unlike the validation case, we get a test prediction from every fold,
        # so those float predictions are averaged. python list comprehension is used again.
        ##test_meta += y_test.reshape(-1) / len(splits)
        test_meta += np.array([sum(test_probs[ix]*class_vals) for
                               ix in range(len(test_probs[:,0]))]) / len(splits)
Fold 0: accuracy = 41.1%, kappa = 0.3398  (no boundary adjustment)
Fold 1: accuracy = 42.3%, kappa = 0.3571  (no boundary adjustment)
Fold 2: accuracy = 41.9%, kappa = 0.3483  (no boundary adjustment)
Fold 3: accuracy = 39.4%, kappa = 0.3229  (no boundary adjustment)
In [ ]:
 

Adjusting the Class Boundaries

In [76]:
# Next two routines are a way to map float regression values to ordinal classes
# by making use of the known distribution of the training classes.

# In the following, y_pred is a floating value, e.g., the output of a regression to the class.
# Many sklearn _classifiers_ can also provide probabilities of the classes which
# can be turned into a floating value as the probability-weighted class, e.g.,:
#       y_probs = clf.predict_proba(X_val)
#       # The class values; use a copy incase we want to modify the values
#       class_vals = clf.classes_.copy()
#       y_floats = [sum(y_probs[ix]*class_vals) for ix in range(len(y_probs[:,0]))]


def get_class_bounds(y, y_pred, N=5, class0_fraction=-1):
    """
    Find boundary values for y_pred to match the known y class percentiles.
    Returns N-1 boundaries in y_pred values that separate y_pred
    into N classes (0, 1, 2, ..., N-1) with same percentiles as y has.
    Can adjust the fraction in Class 0 by the given factor (>=0), if desired. 
    """
    ysort = np.sort(y)
    predsort = np.sort(y_pred)
    bounds = []
    for ibound in range(N-1):
        iy = len(ysort[ysort <= ibound])
        # adjust the number of class 0 predictions?
        if (ibound == 0) and (class0_fraction >= 0.0) :
            iy = int(class0_fraction * iy)
            bounds.append(predsort[iy])
    return bounds

def assign_class(y_pred, boundaries):
    """
    Given class boundaries in y_pred units, output integer class values
    """
    y_classes = np.zeros(len(y_pred))
    for iclass, bound in enumerate(boundaries):
        y_classes[y_pred >= bound] = iclass + 1
    return y_classes.astype(int)
In [77]:
# Look at the histogram of the predicted float class values.
plt.hist(train_meta, bins=50)
plt.title("Training: meta float values")
plt.xlabel("Training y float values")
plt.show()
In [78]:
# This cell calculates and plots the kappa (and MSE) vs the class0 fraction adjustment.
# Note that MSE prefers (lower MSE) a class0 fraction near/at 0,
# whereas kappa prefers (higher kappa) a fraction near 1.
# Then the class0 fraction that gives best training kappa is selected.

# Save values of kappa, MSE, and accuracy vs the class0 fraction
kappas = []
mses = []
accurs = []
# fractions to try... (could go larger than 1 if desired.)
cl0fracs = np.array(np.arange(0.01,1.001,0.01))
for cl0frac in cl0fracs:
    boundaries = get_class_bounds(train_y, train_meta, class0_fraction=cl0frac)
    train_meta_ints = assign_class(train_meta, boundaries)
    kappa = cohen_kappa_score(df_train['AdoptionSpeed'], train_meta_ints, weights='quadratic')
    kappas.append(kappa)
    mse = mean_squared_error(df_train['AdoptionSpeed'], train_meta_ints)
    mses.append(mse)
    accur = accuracy_score(df_train['AdoptionSpeed'], train_meta_ints)
    accurs.append(accur)
    
# Use the class0 fraction that gives the highest training kappa
ifmax = np.array(kappas).argmax()
cl0frac = cl0fracs[ifmax]

print("Best kappa for class0 fraction = {:.4f}".format(cl0frac))
Best kappa for class0 fraction = 1.0000
In [79]:
# Plots to show the kappa, MSE, and Accuracy vs class0 fraction

plt.plot(cl0fracs, kappas)
# indicate the highest-kappa point
plt.plot([cl0frac],[kappas[ifmax]],marker="o",color="green")
plt.title("Training: kappa vs class0_fraction")
plt.xlabel("class0_fraction")
plt.ylabel("kappa")
plt.show()

plt.plot(cl0fracs, mses)
plt.title("Training: MSE vs class0_fraction")
plt.xlabel("class0_fraction")
plt.ylabel("MSE")
plt.show()

plt.plot(cl0fracs, accurs)
plt.title("Training: Accuracy vs class0_fraction")
plt.xlabel("class0_fraction")
plt.ylabel("Accuracy")
plt.show()
In [80]:
# Can skip the class0_fraction adjustment and plotting cells above;
# can delete those two cells and just uncomment this line:
##cl0frac = 1.0

print("Using class0_fraction = {:.4f}, gives boundaries:".format(cl0frac))
boundaries = get_class_bounds(train_y, train_meta, class0_fraction=cl0frac)
print(boundaries)

train_meta_ints = assign_class(train_meta, boundaries)
kappa = cohen_kappa_score(train_y, train_meta_ints, weights='quadratic')

print("Adjusted boundaries give:")
print("kappa = {:.4f}  (with accuracy = {:.1f}%)".format(kappa,
                                100.0*accuracy_score(train_y, train_meta_ints)))
Using class0_fraction = 1.0000, gives boundaries:
[1.812640295464131]
Adjusted boundaries give:
kappa = 0.0130  (with accuracy = 19.7%)
In [81]:
# Confusion Matrix
con_mat = confusion_matrix(train_y, train_meta_ints)

# Look at the number that are on the diagonal (exact agreement)
diag = 0.0
for id in range(5):
    diag += con_mat[id,id]
print("\nConfusion matrix - Columns are prediced 0, predicted 1, etc.\n")
print(con_mat)
print("")
print("\n{2:.2f}% = {0}/{1} are on the diagonal (= accuracy)".format(
        int(diag), con_mat.sum(), 100.0*diag/con_mat.sum()))
Confusion matrix - Columns are prediced 0, predicted 1, etc.

[[  35  375    0    0    0]
 [ 175 2915    0    0    0]
 [ 129 3908    0    0    0]
 [  55 3204    0    0    0]
 [  16 4181    0    0    0]]


19.68% = 2950/14993 are on the diagonal (= accuracy)
In [82]:
plt.hist(train_meta_ints, bins=40, color='blue')
plt.hist(train_y, bins=20, bottom=0.0, alpha=0.2)
plt.title("Train: Boundary-based Predictions")
plt.show()

Generate and Output the Test Predictions

In [83]:
plt.hist(test_meta, bins=50)
plt.title("Test: meta float values")
plt.show()
In [84]:
# Map the test values to integers using the training boundaries
test_meta_ints = assign_class(test_meta, boundaries)
plt.hist(test_meta_ints.astype(int), bins=50)
plt.title("Test: Boundary-based Predictions")
plt.show()
In [85]:
sub = pd.read_csv('../WBS-Project/data/sample_submission.csv')
In [86]:
sub['AdoptionSpeed'] = test_meta_ints
sub['AdoptionSpeed'] = sub['AdoptionSpeed'].astype(int)
sub.to_csv("submission.csv", index=False)
In [87]:
submission = pd.read_csv('submission.csv')
submission.head()
Out[87]:
PetID AdoptionSpeed
0 e2dfc2935 1
1 f153b465f 1
2 3c90f3f54 1
3 e02abc8a3 1
4 09f0df7d1 1

Dataset adjustment

Because of the low scores of accuracy, now we will try to adjust our dataset, in that way that we will make our target feature binary. For the values 0,1,2 we will put 0, and for the values 3 and 4 we will put 1 and train tha data so we would compare the scores. In the next steps we will use SMOTE, to balance the samples of 0 and 1 so that we will have bigger trust in the further testing of the scores from the training models.

In [88]:
#tmp=df_train["Type"==2].fillna(0,inplace=True)

tmp=df_train
tmp

tmp['AdoptionSpeed']=tmp['AdoptionSpeed'].replace([1],0)
tmp['AdoptionSpeed']=tmp['AdoptionSpeed'].replace([2],0)
tmp['AdoptionSpeed']=tmp['AdoptionSpeed'].replace([3],1)
tmp['AdoptionSpeed']=tmp['AdoptionSpeed'].replace([4],1)
tmp
Out[88]:
Type Name Age Breed1 Breed2 Gender Color1 Color2 Color3 MaturitySize ... Fee State RescuerID VideoAmt Description PetID PhotoAmt AdoptionSpeed SentMagnitude SentScore
0 2 Nibble 3 299 0 1 1 7 0 1 ... 100 41326 8480853f516546f6cf33aa88cd76c379 0 Nibble is a 3+ month old ball of cuteness. He ... 86e1089a3 1 0 0 0
1 2 No Name Yet 1 265 0 1 1 2 0 2 ... 0 41401 3082c7125d8fb66f7dd4bff4192c8b14 0 I just found it alone yesterday near my apartm... 6296e909a 2 0 0 0
2 1 Brisco 1 307 0 1 2 7 0 2 ... 0 41326 fa90fa5b1ee11c86938398b60abc32cb 0 Their pregnant mother was dumped by her irresp... 3422e4906 7 1 0 0
3 1 Miko 4 307 0 2 1 2 0 2 ... 150 41401 9238e4f44c71a75282e62f7136c6b240 0 Good guard dog, very alert, active, obedience ... 5842f1ff5 8 0 0 0
4 1 Hunter 1 307 0 1 1 0 0 2 ... 0 41326 95481e953f8aed9ec3d16fc4509537e8 0 This handsome yet cute boy is up for adoption.... 850a43f90 3 0 0 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
14988 2 Undef 2 266 0 3 1 0 0 2 ... 0 41326 61c84bd7bcb6fb31d2d480b1bcf9682e 0 I have 4 kittens that need to be adopt urgentl... dc0935a84 3 0 0 0
14989 2 Serato & Eddie 60 265 264 3 1 4 7 2 ... 0 41326 1d5096c4a5e159a3b750c5cfcf6ceabf 0 Serato(female cat- 3 color) is 4 years old and... a01ab5b30 3 1 0 0
14990 2 Monkies 2 265 266 3 5 6 7 3 ... 30 41326 6f40a7acfad5cc0bb3e44591ea446c05 0 Mix breed, good temperament kittens. Love huma... d981b6395 5 1 0 0
14991 2 Ms Daym 9 266 0 2 4 7 0 1 ... 0 41336 c311c0c569245baa147d91fa4e351ae4 0 she is very shy..adventures and independent..s... e4da1c9e4 3 1 0 0
14992 1 Fili 1 307 307 1 2 0 0 2 ... 0 41332 9ed1d5493d223eaa5024c1a031dbc9c2 0 Fili just loves laying around and also loves b... a83d95ead 1 1 0 0

14993 rows × 26 columns

In [89]:
# Everything except AdoptionSpeed variable
#and the irrelevant attributes
X1=tmp.drop(columns = ['Name','RescuerID','Description','PetID','State','AdoptionSpeed','SentMagnitude','SentScore'])
y1=tmp["AdoptionSpeed"]
# Target variable
#y1 = df_train["Type"]
# Independent variables (no target column)
#print(X1.head())
X1.tail()
Out[89]:
Type Age Breed1 Breed2 Gender Color1 Color2 Color3 MaturitySize FurLength Vaccinated Dewormed Sterilized Health Quantity Fee VideoAmt PhotoAmt
14988 2 2 266 0 3 1 0 0 2 2 2 2 2 1 4 0 0 3
14989 2 60 265 264 3 1 4 7 2 2 1 1 1 1 2 0 0 3
14990 2 2 265 266 3 5 6 7 3 2 2 1 3 1 5 30 0 5
14991 2 9 266 0 2 4 7 0 1 1 1 1 1 1 1 0 0 3
14992 1 1 307 307 1 2 0 0 2 1 2 2 2 1 1 0 0 1
In [90]:
#reassigning the target value,
#but now the values are adjusted so the type of the class is binary 
# Target variable
y1 = tmp["AdoptionSpeed"]
y1
Out[90]:
0        0
1        0
2        1
3        0
4        0
        ..
14988    0
14989    1
14990    1
14991    1
14992    1
Name: AdoptionSpeed, Length: 14993, dtype: int64
In [91]:
# Random seed for reproducibility
np.random.seed(42)

# Split into train & test set
X_train, X_test, y_train, y_test = train_test_split(X1, # independent variables 
                                                    y1, # dependent variable
                                                    test_size = 0.2) # percentage of data to use for test set
In [92]:
tmp.groupby('AdoptionSpeed').count()
Out[92]:
Type Name Age Breed1 Breed2 Gender Color1 Color2 Color3 MaturitySize ... Quantity Fee State RescuerID VideoAmt Description PetID PhotoAmt SentMagnitude SentScore
AdoptionSpeed
0 7537 7537 7537 7537 7537 7537 7537 7537 7537 7537 ... 7537 7537 7537 7537 7537 7537 7537 7537 7537 7537
1 7456 7456 7456 7456 7456 7456 7456 7456 7456 7456 ... 7456 7456 7456 7456 7456 7456 7456 7456 7456 7456

2 rows × 25 columns

Synthetic Minority Oversampling Technique (SMOTE)

In cases where the target feature has imbalanced amount of values, this method - SMOTE is used to give better balance to the dataset, so there is better chance that the accuracy would be more precise. Since we checked that our class does not have imbalance, we are not going to use SMOTE.

In [137]:
#from collections import Counter
#from imblearn.over_sampling import SMOTE 
#print (Counter(y1))
#sm = SMOTE(random_state=42)
#X_res, y_res = sm.fit_resample(X1, y1)
#print('Resampled dataset shape %s' % Counter(y_res))
In [138]:
#print (X_res.shape)
#print (y_res.shape)
In [121]:
#X_train, X_test, y_train, y_test = train_test_split(X_res, y_res, test_size=0.2, random_state=1,stratify=y_res)
In [114]:
X1
Out[114]:
['Type',
 'Age',
 'Breed1',
 'Breed2',
 'Gender',
 'Color1',
 'Color2',
 'Color3',
 'MaturitySize',
 'FurLength',
 'Vaccinated',
 'Dewormed',
 'Sterilized',
 'Health',
 'Quantity',
 'Fee',
 'VideoAmt',
 'PhotoAmt']

Logistic Regression

In [109]:
#data_final_vars=X1.reshape(1, -1)
data_final_vars=X1
y1=['y']
X1=[i for i in data_final_vars if i not in y]

from sklearn.feature_selection import RFE

lr=LogisticRegression()
rfe=RFE(lr,20)
rfe=rfe.fit(X_train, y_train)
print(rfe.support_)
print(rfe.ranking_)
[ True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True]
[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]

RFE

Recursive Feature Elimination(RFE) is based on the idea to repeatedly construct a model and choose either the best or worst performing feature, setting the feature aside and then repeating the process with the rest of the features.This process is applied until all features in the dataset are exhausted. The goal of RFE is to select features by recursively considering smaller and smaller sets of features.

The ReF has helped us select the following features: 'Type', 'Age', 'Breed1', 'Breed2', 'Gender', 'Color1', 'Color2', 'Color3', 'MaturitySize', 'FurLength', 'Vaccinated', 'Dewormed', 'Sterilized', 'Health', 'Quantity', 'Fee', 'VideoAmt', 'PhotoAmt'.

In [118]:
cols=['Type',
 'Age',
 'Breed1',
 'Breed2',
 'Gender',
 'Color1',
 'Color2',
 'Color3',
 'MaturitySize',
 'FurLength',
 'Vaccinated',
 'Dewormed',
 'Sterilized',
 'Health',
 'Quantity',
 'Fee',
 'VideoAmt',
 'PhotoAmt']
X1=X_train[cols]
y1=y_train
In [120]:
import statsmodels.api as sm 
logit_model=sm.Logit(y1,X1)
result=logit_model.fit()
print(result.summary2())
Optimization terminated successfully.
         Current function value: 0.670423
         Iterations 5
                          Results: Logit
==================================================================
Model:              Logit            Pseudo R-squared: 0.033      
Dependent Variable: AdoptionSpeed    AIC:              16118.0977 
Date:               2020-09-29 20:13 BIC:              16251.1566 
No. Observations:   11994            Log-Likelihood:   -8041.0    
Df Model:           17               LL-Null:          -8313.6    
Df Residuals:       11976            LLR p-value:      5.9710e-105
Converged:          1.0000           Scale:            1.0000     
No. Iterations:     5.0000                                        
-------------------------------------------------------------------
                Coef.   Std.Err.     z     P>|z|    [0.025   0.975]
-------------------------------------------------------------------
Type           -0.2866    0.0394  -7.2799  0.0000  -0.3638  -0.2095
Age             0.0138    0.0012  11.1321  0.0000   0.0114   0.0163
Breed1          0.0033    0.0003  10.9598  0.0000   0.0027   0.0039
Breed2          0.0004    0.0002   2.2699  0.0232   0.0000   0.0007
Gender          0.1316    0.0311   4.2334  0.0000   0.0707   0.1925
Color1         -0.0401    0.0115  -3.4754  0.0005  -0.0627  -0.0175
Color2         -0.0072    0.0071  -1.0228  0.3064  -0.0211   0.0066
Color3         -0.0091    0.0070  -1.2863  0.1983  -0.0228   0.0047
MaturitySize    0.0575    0.0324   1.7753  0.0758  -0.0060   0.1209
FurLength      -0.3222    0.0325  -9.9093  0.0000  -0.3859  -0.2584
Vaccinated     -0.1728    0.0425  -4.0618  0.0000  -0.2561  -0.0894
Dewormed        0.1172    0.0397   2.9514  0.0032   0.0394   0.1950
Sterilized     -0.2346    0.0373  -6.2878  0.0000  -0.3078  -0.1615
Health          0.1342    0.0833   1.6101  0.1074  -0.0292   0.2975
Quantity        0.0477    0.0154   3.0987  0.0019   0.0175   0.0778
Fee             0.0002    0.0003   0.5974  0.5502  -0.0003   0.0007
VideoAmt        0.0194    0.0584   0.3314  0.7403  -0.0952   0.1339
PhotoAmt       -0.0035    0.0056  -0.6261  0.5312  -0.0144   0.0075
==================================================================

The p -values for most of the variables are smaller than 0.05, except 7 of them, therefore, we will remove them.

In [121]:
cols=['Type',
 'Age',
 'Breed1',
 'Breed2',
 'Gender',
 'Color1',
 'FurLength',
 'Vaccinated',
 'Dewormed',
 'Sterilized',
 'Quantity']
X1=X_train[cols]
y1=y_train

logit_model=sm.Logit(y1,X1)
result=logit_model.fit()
print(result.summary2())
Optimization terminated successfully.
         Current function value: 0.670877
         Iterations 5
                          Results: Logit
==================================================================
Model:              Logit            Pseudo R-squared: 0.032      
Dependent Variable: AdoptionSpeed    AIC:              16114.9987 
Date:               2020-09-29 20:21 BIC:              16196.3124 
No. Observations:   11994            Log-Likelihood:   -8046.5    
Df Model:           10               LL-Null:          -8313.6    
Df Residuals:       11983            LLR p-value:      2.2162e-108
Converged:          1.0000           Scale:            1.0000     
No. Iterations:     5.0000                                        
-------------------------------------------------------------------
                Coef.   Std.Err.     z     P>|z|    [0.025   0.975]
-------------------------------------------------------------------
Type           -0.2982    0.0360  -8.2782  0.0000  -0.3688  -0.2276
Age             0.0149    0.0012  12.6718  0.0000   0.0126   0.0172
Breed1          0.0037    0.0003  14.0718  0.0000   0.0031   0.0042
Breed2          0.0004    0.0002   2.5473  0.0109   0.0001   0.0007
Gender          0.1366    0.0304   4.4891  0.0000   0.0770   0.1963
Color1         -0.0318    0.0109  -2.9228  0.0035  -0.0531  -0.0105
FurLength      -0.3019    0.0313  -9.6550  0.0000  -0.3632  -0.2406
Vaccinated     -0.1686    0.0424  -3.9799  0.0001  -0.2517  -0.0856
Dewormed        0.1193    0.0396   3.0161  0.0026   0.0418   0.1968
Sterilized     -0.2129    0.0362  -5.8802  0.0000  -0.2838  -0.1419
Quantity        0.0409    0.0150   2.7251  0.0064   0.0115   0.0702
==================================================================

In [122]:
from sklearn import metrics

X_train,X_test,y_train,y_test=train_test_split(X1,y1,test_size=0.2,random_state=0)
logreg=LogisticRegression()
logreg.fit(X_train,y_train)
Out[122]:
LogisticRegression()
In [124]:
#predicting the test_set results and calculating accuracy
y_pred=logreg.predict(X_test)
print('Accuracy of logistic regression classifier on test set: {:.2f}'.format(logreg.score(X_test,y_test)))
Accuracy of logistic regression classifier on test set: 0.58
In [125]:
logistic_regression= LogisticRegression()
logistic_regression.fit(X_train,y_train)
y_pred=logistic_regression.predict(X_test)
In [113]:
cr_logistic_regression=classification_report(y_test,y_pred)
cm_logistic_regression=confusion_matrix(y_test,y_pred)
print(cr_logistic_regression)
print (cm_logistic_regression)
              precision    recall  f1-score   support

           0       0.59      0.60      0.59      1525
           1       0.58      0.57      0.57      1474

    accuracy                           0.58      2999
   macro avg       0.58      0.58      0.58      2999
weighted avg       0.58      0.58      0.58      2999

[[914 611]
 [635 839]]

Gaussian Naive Bayes

In [142]:
gnb = GaussianNB()
gnb.fit(X_train, y_train)
y_pred = gnb.predict(X_test)
In [143]:
cr_gnb=classification_report(y_test,y_pred)
cm_gnb=confusion_matrix(y_test,y_pred)
print(cr_gnb)
print (cm_gnb)
              precision    recall  f1-score   support

           0       0.56      0.77      0.65      1507
           1       0.63      0.40      0.49      1508

    accuracy                           0.58      3015
   macro avg       0.59      0.58      0.57      3015
weighted avg       0.59      0.58      0.57      3015

[[1153  354]
 [ 909  599]]

Decision Tree Classifier

In [144]:
from sklearn.tree import DecisionTreeClassifier
clf_gini = DecisionTreeClassifier()
clf_gini = clf_gini.fit(X_train,y_train)
y_pred = clf_gini.predict(X_test)

cr_decision_tree_gini=classification_report(y_test,y_pred)
cm_decision_tree_gini=confusion_matrix(y_test,y_pred)
print(cr_decision_tree_gini)
print (cm_decision_tree_gini)
print("Gini decision tree depth: ",clf_gini.get_depth())

clf_entropy = DecisionTreeClassifier(criterion='entropy')
clf_entropy = clf_entropy.fit(X_train,y_train)
y_pred = clf_entropy.predict(X_test)

cr_decision_tree_entropy=classification_report(y_test,y_pred)
cm_decision_tree_entropy=confusion_matrix(y_test,y_pred)
print(cr_decision_tree_entropy)
print (cm_decision_tree_entropy)
print("Entropy decision tree depth: ",clf_entropy.get_depth())
              precision    recall  f1-score   support

           0       0.58      0.59      0.59      1507
           1       0.59      0.57      0.58      1508

    accuracy                           0.58      3015
   macro avg       0.58      0.58      0.58      3015
weighted avg       0.58      0.58      0.58      3015

[[895 612]
 [643 865]]
Gini decision tree depth:  36
              precision    recall  f1-score   support

           0       0.58      0.60      0.59      1507
           1       0.59      0.57      0.58      1508

    accuracy                           0.58      3015
   macro avg       0.58      0.58      0.58      3015
weighted avg       0.58      0.58      0.58      3015

[[900 607]
 [649 859]]
Entropy decision tree depth:  44
In [145]:
max_depth = []
acc_gini = []
acc_entropy = []
f1_gini =[]
f1_entropy = []
for i in range(1,30):
 dtree = DecisionTreeClassifier(criterion='gini', max_depth=i)
 dtree.fit(X_train, y_train)
 pred = dtree.predict(X_test)
 acc_gini.append(accuracy_score(y_test, pred)*100)
 f1_gini.append(f1_score(y_test, pred)*100)
 ####
 dtree = DecisionTreeClassifier(criterion='entropy', max_depth=i)
 dtree.fit(X_train, y_train)
 pred = dtree.predict(X_test)
 acc_entropy.append(accuracy_score(y_test, pred)*100)
 f1_entropy.append(f1_score(y_test, pred)*100)
 ####
 max_depth.append(i)
d = pd.DataFrame({'acc_gini':pd.Series(acc_gini), 
 'acc_entropy':pd.Series(acc_entropy),
 'f1_gini':pd.Series(f1_gini),                 
 'f1_entropy':pd.Series(f1_entropy),                 
 'max_depth':pd.Series(max_depth)})
# visualizing changes in parameters
plt.plot('max_depth','acc_gini', data=d, label='Accuracy Gini')
plt.plot('max_depth','acc_entropy', data=d, label='Accuracy Entropy')
plt.plot('max_depth','f1_gini', data=d, label='F1 Gini')
plt.plot('max_depth','f1_entropy', data=d, label='F1 Entropy')
plt.xlabel('max_depth')
plt.ylabel('accuracy/gini (%)')
plt.legend();

print("Maximum Gini tree accuracy score on the test data: ",max(acc_gini)," at max depth: ",np.argmax(acc_gini)+1)
print("Maximum Gini f1 score on the test data: ",max(f1_gini)," at max depth: ",np.argmax(f1_gini)+1)
print("Maximum Entropy tree accuracy score on the test data: ",max(acc_entropy)," at max depth: ",np.argmax(acc_entropy)+1)
print("Maximum Entropy f1 score on the test data: ",max(f1_entropy)," at max depth: ",np.argmax(f1_entropy)+1)
Maximum Gini tree accuracy score on the test data:  63.98009950248756  at max depth:  7
Maximum Gini f1 score on the test data:  63.86872830545913  at max depth:  3
Maximum Entropy tree accuracy score on the test data:  63.78109452736318  at max depth:  8
Maximum Entropy f1 score on the test data:  63.86872830545913  at max depth:  3
In [146]:
# we find out that the best Tree Classifier is the one witch uses Gini index and has maximum depth 7

clf = DecisionTreeClassifier(max_depth=7)
clf = clf.fit(X_train,y_train)
y_pred = clf.predict(X_test)

cr_dt=classification_report(y_test,y_pred)
cm_dt=confusion_matrix(y_test,y_pred)
print(cr_dt)
print (cm_dt)
              precision    recall  f1-score   support

           0       0.62      0.70      0.66      1507
           1       0.66      0.58      0.62      1508

    accuracy                           0.64      3015
   macro avg       0.64      0.64      0.64      3015
weighted avg       0.64      0.64      0.64      3015

[[1055  452]
 [ 636  872]]

KNN classifier

In [147]:
accuracy_scores = []
f1_scores = []

neighbors = range(2, 20)
knn = KNeighborsClassifier()

for i in neighbors:
    knn.set_params(n_neighbors = i)
    knn.fit(X_train, y_train)
    y_pred=knn.predict(X_test)    
    accuracy_scores.append(accuracy_score(y_test,y_pred)*100)
    f1_scores.append(f1_score(y_test,y_pred)*100)
In [148]:
plt.plot(neighbors,accuracy_scores , label="Accuracy score")
plt.plot(neighbors, f1_scores, label="F1 score")
plt.xticks(np.arange(1, 21, 1))
plt.xlabel("Number of neighbors")
plt.ylabel("Model score")
plt.legend()

print(f"Maximum KNN accuracy score on the test data: {max(accuracy_scores):.2f}%")
print(f"Maximum KNN f1 score on the test data: {max(f1_scores):.2f}%")
Maximum KNN accuracy score on the test data: 60.93%
Maximum KNN f1 score on the test data: 59.51%
In [149]:
# best for n_neghboours = 2
knn = KNeighborsClassifier(n_neighbors=2)
knn.fit(X_train, y_train)
y_pred=knn.predict(X_test)    

cr_knn=classification_report(y_test,y_pred)
cm_knn=confusion_matrix(y_test,y_pred)
print(cr_knn)
print (cm_knn)
              precision    recall  f1-score   support

           0       0.56      0.81      0.66      1507
           1       0.65      0.35      0.46      1508

    accuracy                           0.58      3015
   macro avg       0.60      0.58      0.56      3015
weighted avg       0.60      0.58      0.56      3015

[[1227  280]
 [ 981  527]]

Random Forest Classifier

In [150]:
random_forest = RandomForestClassifier(max_depth=10, random_state=0)
random_forest.fit(X_train,y_train)
y_pred=random_forest.predict(X_test)

cr_rf=classification_report(y_test,y_pred)
cm_rf=confusion_matrix(y_test,y_pred)
print(cr_rf)
print (cm_rf)
              precision    recall  f1-score   support

           0       0.64      0.69      0.66      1507
           1       0.66      0.61      0.63      1508

    accuracy                           0.65      3015
   macro avg       0.65      0.65      0.65      3015
weighted avg       0.65      0.65      0.65      3015

[[1035  472]
 [ 588  920]]

Neural networks

After the machine learning methods were tested, here is a testing with a Deep Learning technique.

In [94]:
# Random seed for reproducibility
np.random.seed(42)

# Split into train & test set
X_train, X_test, y_train, y_test = train_test_split(X, # independent variables 
                                                    y, # dependent variable
                                                    test_size = 0.2) # percentage of data to use for test set
In [95]:
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.optimizers import Adam, SGD, Nadam
from keras.regularizers import l1, l1_l2, l2
from keras.layers.normalization import BatchNormalization
from keras.callbacks import EarlyStopping
from keras import backend as K

def rmse_dl(y_true, y_pred):
        return K.sqrt(K.mean(K.square(y_pred - y_true))) 


regressor = Sequential()
regressor.add(Dense(units=6, input_dim=X_train.shape[1],activation='linear',  kernel_initializer='random_uniform'))
regressor.add(BatchNormalization())
regressor.add(Dropout(0.2))
# if we want to reduce rmse/increase training error, uncomment this
#regressor.add(Dense(units=10, activation='relu', kernel_initializer='random_uniform',activity_regularizer=l1_l2(0.5, 0.5)))
#regressor.add(BatchNormalization())
regressor.add(Dense(units=5, activation='linear', kernel_initializer='random_uniform', activity_regularizer=l1(6.8)))
regressor.add(BatchNormalization())
regressor.add(Dense(units=1))

#decay=0.000002
adam = Nadam(lr=0.0005, schedule_decay=0.004)
regressor.compile(optimizer=adam, loss='mse', metrics=[rmse_dl])
#EarlyStopping(monitor='val_loss',min_delta=1e-5, patience=57)
regressor.fit(X_train, y_train, batch_size=32, epochs=550, shuffle=True, 
              validation_split=0.1, verbose=1)
sgd = SGD(lr=0.0000002, decay=0.000001,momentum=0.7 ,nesterov=True)
regressor.compile(optimizer=sgd, loss='mse', metrics=[rmse_dl])
regressor.fit(X_train, y_train, batch_size=32, epochs=25, shuffle=True, 
              validation_split=0.1, verbose=1)
predicted = regressor.predict(X_test)
print("RMSE Neural Network log final: %.3f"
      % np.sqrt(mean_squared_error(y_test, predicted)))
print("RMSE Neural Network final: %.3f"
      % np.sqrt(mean_squared_error(np.expm1(y_test), np.expm1(predicted))))
Epoch 1/550
338/338 [==============================] - ETA: 0s - loss: 6.8028 - rmse_dl: 2.4993WARNING:tensorflow:Callbacks method `on_test_batch_begin` is slow compared to the batch time (batch time: 0.0030s vs `on_test_batch_begin` time: 0.0093s). Check your callbacks.
338/338 [==============================] - 4s 11ms/step - loss: 6.8028 - rmse_dl: 2.4993 - val_loss: 4.5238 - val_rmse_dl: 2.1088
Epoch 2/550
338/338 [==============================] - 1s 4ms/step - loss: 3.3826 - rmse_dl: 1.8055 - val_loss: 2.2734 - val_rmse_dl: 1.4882
Epoch 3/550
338/338 [==============================] - 1s 4ms/step - loss: 1.7450 - rmse_dl: 1.2957 - val_loss: 1.4323 - val_rmse_dl: 1.1800
Epoch 4/550
338/338 [==============================] - 1s 3ms/step - loss: 1.4184 - rmse_dl: 1.1720 - val_loss: 1.3625 - val_rmse_dl: 1.1480
Epoch 5/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3877 - rmse_dl: 1.1583 - val_loss: 1.4354 - val_rmse_dl: 1.1736
Epoch 6/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3782 - rmse_dl: 1.1540 - val_loss: 1.3673 - val_rmse_dl: 1.1523
Epoch 7/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3731 - rmse_dl: 1.1506 - val_loss: 1.3483 - val_rmse_dl: 1.1456
Epoch 8/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3738 - rmse_dl: 1.1515 - val_loss: 1.4113 - val_rmse_dl: 1.1604
Epoch 9/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3681 - rmse_dl: 1.1488 - val_loss: 1.3407 - val_rmse_dl: 1.1458
Epoch 10/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3670 - rmse_dl: 1.1487 - val_loss: 1.3691 - val_rmse_dl: 1.1500
Epoch 11/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3643 - rmse_dl: 1.1479 - val_loss: 1.3592 - val_rmse_dl: 1.1465
Epoch 12/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3640 - rmse_dl: 1.1481 - val_loss: 1.3814 - val_rmse_dl: 1.1559
Epoch 13/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3631 - rmse_dl: 1.1489 - val_loss: 1.3549 - val_rmse_dl: 1.1438
Epoch 14/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3572 - rmse_dl: 1.1453 - val_loss: 1.3435 - val_rmse_dl: 1.1476
Epoch 15/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3513 - rmse_dl: 1.1442 - val_loss: 1.3686 - val_rmse_dl: 1.1521
Epoch 16/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3494 - rmse_dl: 1.1410 - val_loss: 1.3551 - val_rmse_dl: 1.1468
Epoch 17/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3535 - rmse_dl: 1.1451 - val_loss: 1.3470 - val_rmse_dl: 1.1424
Epoch 18/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3489 - rmse_dl: 1.1437 - val_loss: 1.3511 - val_rmse_dl: 1.1477
Epoch 19/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3494 - rmse_dl: 1.1439 - val_loss: 1.3929 - val_rmse_dl: 1.1583
Epoch 20/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3486 - rmse_dl: 1.1431 - val_loss: 1.3439 - val_rmse_dl: 1.1425
Epoch 21/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3452 - rmse_dl: 1.1422 - val_loss: 1.3383 - val_rmse_dl: 1.1405
Epoch 22/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3454 - rmse_dl: 1.1421 - val_loss: 1.3617 - val_rmse_dl: 1.1488
Epoch 23/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3363 - rmse_dl: 1.1391 - val_loss: 1.3502 - val_rmse_dl: 1.1471
Epoch 24/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3404 - rmse_dl: 1.1413 - val_loss: 1.3618 - val_rmse_dl: 1.1512
Epoch 25/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3417 - rmse_dl: 1.1416 - val_loss: 1.3436 - val_rmse_dl: 1.1440
Epoch 26/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3399 - rmse_dl: 1.1408 - val_loss: 1.3506 - val_rmse_dl: 1.1421
Epoch 27/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3381 - rmse_dl: 1.1413 - val_loss: 1.3471 - val_rmse_dl: 1.1440
Epoch 28/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3365 - rmse_dl: 1.1395 - val_loss: 1.3552 - val_rmse_dl: 1.1470
Epoch 29/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3370 - rmse_dl: 1.1415 - val_loss: 1.3536 - val_rmse_dl: 1.1469
Epoch 30/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3309 - rmse_dl: 1.1380 - val_loss: 1.3468 - val_rmse_dl: 1.1472
Epoch 31/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3333 - rmse_dl: 1.1397 - val_loss: 1.3311 - val_rmse_dl: 1.1383
Epoch 32/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3336 - rmse_dl: 1.1403 - val_loss: 1.3362 - val_rmse_dl: 1.1433
Epoch 33/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3297 - rmse_dl: 1.1380 - val_loss: 1.3477 - val_rmse_dl: 1.1440
Epoch 34/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3321 - rmse_dl: 1.1397 - val_loss: 1.3358 - val_rmse_dl: 1.1430
Epoch 35/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3317 - rmse_dl: 1.1397 - val_loss: 1.3371 - val_rmse_dl: 1.1437
Epoch 36/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3290 - rmse_dl: 1.1387 - val_loss: 1.3369 - val_rmse_dl: 1.1423
Epoch 37/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3305 - rmse_dl: 1.1399 - val_loss: 1.3308 - val_rmse_dl: 1.1407
Epoch 38/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3272 - rmse_dl: 1.1391 - val_loss: 1.3393 - val_rmse_dl: 1.1419
Epoch 39/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3272 - rmse_dl: 1.1389 - val_loss: 1.3324 - val_rmse_dl: 1.1426
Epoch 40/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3272 - rmse_dl: 1.1395 - val_loss: 1.3304 - val_rmse_dl: 1.1397
Epoch 41/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3279 - rmse_dl: 1.1381 - val_loss: 1.3414 - val_rmse_dl: 1.1456
Epoch 42/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3241 - rmse_dl: 1.1378 - val_loss: 1.3339 - val_rmse_dl: 1.1422
Epoch 43/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3207 - rmse_dl: 1.1359 - val_loss: 1.3417 - val_rmse_dl: 1.1451
Epoch 44/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3273 - rmse_dl: 1.1394 - val_loss: 1.3300 - val_rmse_dl: 1.1401
Epoch 45/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3218 - rmse_dl: 1.1370 - val_loss: 1.3427 - val_rmse_dl: 1.1453
Epoch 46/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3258 - rmse_dl: 1.1387 - val_loss: 1.3287 - val_rmse_dl: 1.1407
Epoch 47/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3226 - rmse_dl: 1.1374 - val_loss: 1.3312 - val_rmse_dl: 1.1404
Epoch 48/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3223 - rmse_dl: 1.1373 - val_loss: 1.3399 - val_rmse_dl: 1.1458
Epoch 49/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3226 - rmse_dl: 1.1384 - val_loss: 1.3264 - val_rmse_dl: 1.1406
Epoch 50/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3210 - rmse_dl: 1.1367 - val_loss: 1.3285 - val_rmse_dl: 1.1413
Epoch 51/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3197 - rmse_dl: 1.1374 - val_loss: 1.3363 - val_rmse_dl: 1.1433
Epoch 52/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3210 - rmse_dl: 1.1368 - val_loss: 1.3508 - val_rmse_dl: 1.1484
Epoch 53/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3204 - rmse_dl: 1.1368 - val_loss: 1.3493 - val_rmse_dl: 1.1482
Epoch 54/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3171 - rmse_dl: 1.1353 - val_loss: 1.3525 - val_rmse_dl: 1.1499
Epoch 55/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3222 - rmse_dl: 1.1383 - val_loss: 1.3365 - val_rmse_dl: 1.1440
Epoch 56/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3207 - rmse_dl: 1.1385 - val_loss: 1.3456 - val_rmse_dl: 1.1478
Epoch 57/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3211 - rmse_dl: 1.1372 - val_loss: 1.3298 - val_rmse_dl: 1.1423
Epoch 58/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3221 - rmse_dl: 1.1385 - val_loss: 1.3362 - val_rmse_dl: 1.1444
Epoch 59/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3179 - rmse_dl: 1.1366 - val_loss: 1.3269 - val_rmse_dl: 1.1404
Epoch 60/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3192 - rmse_dl: 1.1373 - val_loss: 1.3219 - val_rmse_dl: 1.1388
Epoch 61/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3180 - rmse_dl: 1.1372 - val_loss: 1.3310 - val_rmse_dl: 1.1422
Epoch 62/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3173 - rmse_dl: 1.1372 - val_loss: 1.3332 - val_rmse_dl: 1.1437
Epoch 63/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3154 - rmse_dl: 1.1362 - val_loss: 1.3245 - val_rmse_dl: 1.1402
Epoch 64/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3184 - rmse_dl: 1.1367 - val_loss: 1.3219 - val_rmse_dl: 1.1391
Epoch 65/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3208 - rmse_dl: 1.1383 - val_loss: 1.3280 - val_rmse_dl: 1.1416
Epoch 66/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3197 - rmse_dl: 1.1382 - val_loss: 1.3326 - val_rmse_dl: 1.1445
Epoch 67/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3148 - rmse_dl: 1.1361 - val_loss: 1.3490 - val_rmse_dl: 1.1502
Epoch 68/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3172 - rmse_dl: 1.1373 - val_loss: 1.3311 - val_rmse_dl: 1.1432
Epoch 69/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3182 - rmse_dl: 1.1378 - val_loss: 1.3286 - val_rmse_dl: 1.1422
Epoch 70/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3203 - rmse_dl: 1.1381 - val_loss: 1.3336 - val_rmse_dl: 1.1434
Epoch 71/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3165 - rmse_dl: 1.1373 - val_loss: 1.3248 - val_rmse_dl: 1.1411
Epoch 72/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3200 - rmse_dl: 1.1390 - val_loss: 1.3311 - val_rmse_dl: 1.1427
Epoch 73/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3152 - rmse_dl: 1.1375 - val_loss: 1.3157 - val_rmse_dl: 1.1373
Epoch 74/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3131 - rmse_dl: 1.1361 - val_loss: 1.3414 - val_rmse_dl: 1.1480
Epoch 75/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3174 - rmse_dl: 1.1374 - val_loss: 1.3200 - val_rmse_dl: 1.1395
Epoch 76/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3176 - rmse_dl: 1.1377 - val_loss: 1.3222 - val_rmse_dl: 1.1400
Epoch 77/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3181 - rmse_dl: 1.1381 - val_loss: 1.3667 - val_rmse_dl: 1.1567
Epoch 78/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3168 - rmse_dl: 1.1378 - val_loss: 1.3188 - val_rmse_dl: 1.1382
Epoch 79/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3142 - rmse_dl: 1.1371 - val_loss: 1.3350 - val_rmse_dl: 1.1452
Epoch 80/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3135 - rmse_dl: 1.1355 - val_loss: 1.3350 - val_rmse_dl: 1.1452
Epoch 81/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3135 - rmse_dl: 1.1360 - val_loss: 1.3192 - val_rmse_dl: 1.1389
Epoch 82/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3182 - rmse_dl: 1.1393 - val_loss: 1.3411 - val_rmse_dl: 1.1477
Epoch 83/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3142 - rmse_dl: 1.1370 - val_loss: 1.3239 - val_rmse_dl: 1.1412
Epoch 84/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3161 - rmse_dl: 1.1369 - val_loss: 1.3264 - val_rmse_dl: 1.1424
Epoch 85/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3150 - rmse_dl: 1.1378 - val_loss: 1.3447 - val_rmse_dl: 1.1484
Epoch 86/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3147 - rmse_dl: 1.1372 - val_loss: 1.3244 - val_rmse_dl: 1.1414
Epoch 87/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3151 - rmse_dl: 1.1372 - val_loss: 1.3237 - val_rmse_dl: 1.1409
Epoch 88/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3151 - rmse_dl: 1.1369 - val_loss: 1.3276 - val_rmse_dl: 1.1428
Epoch 89/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3147 - rmse_dl: 1.1371 - val_loss: 1.3267 - val_rmse_dl: 1.1420
Epoch 90/550
338/338 [==============================] - 2s 5ms/step - loss: 1.3180 - rmse_dl: 1.1386 - val_loss: 1.3345 - val_rmse_dl: 1.1448
Epoch 91/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3165 - rmse_dl: 1.1372 - val_loss: 1.3244 - val_rmse_dl: 1.1410
Epoch 92/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3158 - rmse_dl: 1.1374 - val_loss: 1.3275 - val_rmse_dl: 1.1421
Epoch 93/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3130 - rmse_dl: 1.1360 - val_loss: 1.3269 - val_rmse_dl: 1.1421
Epoch 94/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3134 - rmse_dl: 1.1370 - val_loss: 1.3241 - val_rmse_dl: 1.1406
Epoch 95/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3128 - rmse_dl: 1.1363 - val_loss: 1.3337 - val_rmse_dl: 1.1452
Epoch 96/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3149 - rmse_dl: 1.1362 - val_loss: 1.3232 - val_rmse_dl: 1.1408
Epoch 97/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3124 - rmse_dl: 1.1368 - val_loss: 1.3407 - val_rmse_dl: 1.1480
Epoch 98/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3127 - rmse_dl: 1.1372 - val_loss: 1.3259 - val_rmse_dl: 1.1415
Epoch 99/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3139 - rmse_dl: 1.1370 - val_loss: 1.3188 - val_rmse_dl: 1.1394
Epoch 100/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3131 - rmse_dl: 1.1374 - val_loss: 1.3268 - val_rmse_dl: 1.1423
Epoch 101/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3152 - rmse_dl: 1.1382 - val_loss: 1.3576 - val_rmse_dl: 1.1549
Epoch 102/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3129 - rmse_dl: 1.1364 - val_loss: 1.3656 - val_rmse_dl: 1.1583
Epoch 103/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3170 - rmse_dl: 1.1388 - val_loss: 1.3268 - val_rmse_dl: 1.1424
Epoch 104/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3148 - rmse_dl: 1.1384 - val_loss: 1.3283 - val_rmse_dl: 1.1427
Epoch 105/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3143 - rmse_dl: 1.1376 - val_loss: 1.3243 - val_rmse_dl: 1.1417
Epoch 106/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3138 - rmse_dl: 1.1375 - val_loss: 1.3769 - val_rmse_dl: 1.1618
Epoch 107/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3160 - rmse_dl: 1.1386 - val_loss: 1.3387 - val_rmse_dl: 1.1470
Epoch 108/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3081 - rmse_dl: 1.1344 - val_loss: 1.3346 - val_rmse_dl: 1.1450
Epoch 109/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3121 - rmse_dl: 1.1356 - val_loss: 1.3164 - val_rmse_dl: 1.1383
Epoch 110/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3145 - rmse_dl: 1.1378 - val_loss: 1.3212 - val_rmse_dl: 1.1409
Epoch 111/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3107 - rmse_dl: 1.1360 - val_loss: 1.3342 - val_rmse_dl: 1.1454
Epoch 112/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3153 - rmse_dl: 1.1384 - val_loss: 1.3315 - val_rmse_dl: 1.1448
Epoch 113/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3130 - rmse_dl: 1.1376 - val_loss: 1.3179 - val_rmse_dl: 1.1394
Epoch 114/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3093 - rmse_dl: 1.1353 - val_loss: 1.3530 - val_rmse_dl: 1.1528
Epoch 115/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3150 - rmse_dl: 1.1377 - val_loss: 1.3222 - val_rmse_dl: 1.1408
Epoch 116/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3141 - rmse_dl: 1.1378 - val_loss: 1.3400 - val_rmse_dl: 1.1478
Epoch 117/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3130 - rmse_dl: 1.1369 - val_loss: 1.3273 - val_rmse_dl: 1.1426
Epoch 118/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3132 - rmse_dl: 1.1376 - val_loss: 1.3341 - val_rmse_dl: 1.1449
Epoch 119/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3102 - rmse_dl: 1.1357 - val_loss: 1.3536 - val_rmse_dl: 1.1533
Epoch 120/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3087 - rmse_dl: 1.1346 - val_loss: 1.3321 - val_rmse_dl: 1.1451
Epoch 121/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3117 - rmse_dl: 1.1369 - val_loss: 1.3247 - val_rmse_dl: 1.1419
Epoch 122/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3117 - rmse_dl: 1.1371 - val_loss: 1.3203 - val_rmse_dl: 1.1400
Epoch 123/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3128 - rmse_dl: 1.1375 - val_loss: 1.3210 - val_rmse_dl: 1.1401
Epoch 124/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3137 - rmse_dl: 1.1373 - val_loss: 1.3244 - val_rmse_dl: 1.1424
Epoch 125/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3117 - rmse_dl: 1.1364 - val_loss: 1.3210 - val_rmse_dl: 1.1403
Epoch 126/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3106 - rmse_dl: 1.1361 - val_loss: 1.3258 - val_rmse_dl: 1.1420
Epoch 127/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3129 - rmse_dl: 1.1377 - val_loss: 1.3301 - val_rmse_dl: 1.1435
Epoch 128/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3108 - rmse_dl: 1.1363 - val_loss: 1.3196 - val_rmse_dl: 1.1406
Epoch 129/550
338/338 [==============================] - ETA: 0s - loss: 1.3176 - rmse_dl: 1.139 - 1s 4ms/step - loss: 1.3110 - rmse_dl: 1.1370 - val_loss: 1.3237 - val_rmse_dl: 1.1410
Epoch 130/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3082 - rmse_dl: 1.1348 - val_loss: 1.3194 - val_rmse_dl: 1.1399
Epoch 131/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3095 - rmse_dl: 1.1354 - val_loss: 1.3297 - val_rmse_dl: 1.1433
Epoch 132/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3146 - rmse_dl: 1.1384 - val_loss: 1.3294 - val_rmse_dl: 1.1431
Epoch 133/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3114 - rmse_dl: 1.1365 - val_loss: 1.3265 - val_rmse_dl: 1.1425
Epoch 134/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3126 - rmse_dl: 1.1373 - val_loss: 1.3233 - val_rmse_dl: 1.1413
Epoch 135/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3135 - rmse_dl: 1.1379 - val_loss: 1.3285 - val_rmse_dl: 1.1434
Epoch 136/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3098 - rmse_dl: 1.1355 - val_loss: 1.3171 - val_rmse_dl: 1.1397
Epoch 137/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3113 - rmse_dl: 1.1352 - val_loss: 1.3299 - val_rmse_dl: 1.1440
Epoch 138/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3105 - rmse_dl: 1.1372 - val_loss: 1.3433 - val_rmse_dl: 1.1490
Epoch 139/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3115 - rmse_dl: 1.1369 - val_loss: 1.3371 - val_rmse_dl: 1.1474
Epoch 140/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3151 - rmse_dl: 1.1395 - val_loss: 1.3388 - val_rmse_dl: 1.1478
Epoch 141/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3077 - rmse_dl: 1.1351 - val_loss: 1.3326 - val_rmse_dl: 1.1450
Epoch 142/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3095 - rmse_dl: 1.1356 - val_loss: 1.3332 - val_rmse_dl: 1.1454
Epoch 143/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3131 - rmse_dl: 1.1378 - val_loss: 1.3373 - val_rmse_dl: 1.1466
Epoch 144/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3105 - rmse_dl: 1.1366 - val_loss: 1.3301 - val_rmse_dl: 1.1441
Epoch 145/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3081 - rmse_dl: 1.1346 - val_loss: 1.3183 - val_rmse_dl: 1.1395
Epoch 146/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3119 - rmse_dl: 1.1368 - val_loss: 1.3261 - val_rmse_dl: 1.1427
Epoch 147/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3149 - rmse_dl: 1.1385 - val_loss: 1.3292 - val_rmse_dl: 1.1441
Epoch 148/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3126 - rmse_dl: 1.1367 - val_loss: 1.3168 - val_rmse_dl: 1.1395
Epoch 149/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3123 - rmse_dl: 1.1379 - val_loss: 1.3214 - val_rmse_dl: 1.1411
Epoch 150/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3145 - rmse_dl: 1.1386 - val_loss: 1.3208 - val_rmse_dl: 1.1405
Epoch 151/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3098 - rmse_dl: 1.1360 - val_loss: 1.3318 - val_rmse_dl: 1.1451
Epoch 152/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3115 - rmse_dl: 1.1371 - val_loss: 1.3302 - val_rmse_dl: 1.1446
Epoch 153/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3115 - rmse_dl: 1.1366 - val_loss: 1.3154 - val_rmse_dl: 1.1392
Epoch 154/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3133 - rmse_dl: 1.1378 - val_loss: 1.3341 - val_rmse_dl: 1.1455
Epoch 155/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3098 - rmse_dl: 1.1357 - val_loss: 1.3326 - val_rmse_dl: 1.1454
Epoch 156/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3128 - rmse_dl: 1.1372 - val_loss: 1.3215 - val_rmse_dl: 1.1412
Epoch 157/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3115 - rmse_dl: 1.1370 - val_loss: 1.3167 - val_rmse_dl: 1.1394
Epoch 158/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3134 - rmse_dl: 1.1374 - val_loss: 1.3201 - val_rmse_dl: 1.1401
Epoch 159/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3087 - rmse_dl: 1.1362 - val_loss: 1.3285 - val_rmse_dl: 1.1436
Epoch 160/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3081 - rmse_dl: 1.1357 - val_loss: 1.3377 - val_rmse_dl: 1.1465
Epoch 161/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3139 - rmse_dl: 1.1384 - val_loss: 1.3387 - val_rmse_dl: 1.1478
Epoch 162/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3113 - rmse_dl: 1.1371 - val_loss: 1.3188 - val_rmse_dl: 1.1396
Epoch 163/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3077 - rmse_dl: 1.1354 - val_loss: 1.3352 - val_rmse_dl: 1.1463
Epoch 164/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3102 - rmse_dl: 1.1371 - val_loss: 1.3241 - val_rmse_dl: 1.1423
Epoch 165/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3089 - rmse_dl: 1.1342 - val_loss: 1.3251 - val_rmse_dl: 1.1431
Epoch 166/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3100 - rmse_dl: 1.1363 - val_loss: 1.3316 - val_rmse_dl: 1.1451
Epoch 167/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3070 - rmse_dl: 1.1346 - val_loss: 1.3222 - val_rmse_dl: 1.1417
Epoch 168/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3127 - rmse_dl: 1.1371 - val_loss: 1.3427 - val_rmse_dl: 1.1491
Epoch 169/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3115 - rmse_dl: 1.1365 - val_loss: 1.3325 - val_rmse_dl: 1.1453
Epoch 170/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3146 - rmse_dl: 1.1381 - val_loss: 1.3370 - val_rmse_dl: 1.1472
Epoch 171/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3095 - rmse_dl: 1.1366 - val_loss: 1.3371 - val_rmse_dl: 1.1476
Epoch 172/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3103 - rmse_dl: 1.1366 - val_loss: 1.3398 - val_rmse_dl: 1.1483
Epoch 173/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3080 - rmse_dl: 1.1355 - val_loss: 1.3369 - val_rmse_dl: 1.1473
Epoch 174/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3119 - rmse_dl: 1.1373 - val_loss: 1.3233 - val_rmse_dl: 1.1420
Epoch 175/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3104 - rmse_dl: 1.1371 - val_loss: 1.3184 - val_rmse_dl: 1.1398
Epoch 176/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3085 - rmse_dl: 1.1362 - val_loss: 1.3135 - val_rmse_dl: 1.1379
Epoch 177/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3076 - rmse_dl: 1.1351 - val_loss: 1.3397 - val_rmse_dl: 1.1485
Epoch 178/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3096 - rmse_dl: 1.1366 - val_loss: 1.3209 - val_rmse_dl: 1.1408
Epoch 179/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3111 - rmse_dl: 1.1371 - val_loss: 1.3336 - val_rmse_dl: 1.1457
Epoch 180/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3127 - rmse_dl: 1.1372 - val_loss: 1.3189 - val_rmse_dl: 1.1403
Epoch 181/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3107 - rmse_dl: 1.1362 - val_loss: 1.3300 - val_rmse_dl: 1.1446
Epoch 182/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3109 - rmse_dl: 1.1374 - val_loss: 1.3636 - val_rmse_dl: 1.1583
Epoch 183/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3126 - rmse_dl: 1.1375 - val_loss: 1.3169 - val_rmse_dl: 1.1395
Epoch 184/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3072 - rmse_dl: 1.1359 - val_loss: 1.3319 - val_rmse_dl: 1.1447
Epoch 185/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3077 - rmse_dl: 1.1354 - val_loss: 1.3305 - val_rmse_dl: 1.1440
Epoch 186/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3104 - rmse_dl: 1.1367 - val_loss: 1.3470 - val_rmse_dl: 1.1516
Epoch 187/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3099 - rmse_dl: 1.1358 - val_loss: 1.3213 - val_rmse_dl: 1.1410
Epoch 188/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3112 - rmse_dl: 1.1357 - val_loss: 1.3193 - val_rmse_dl: 1.1404
Epoch 189/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3113 - rmse_dl: 1.1370 - val_loss: 1.3214 - val_rmse_dl: 1.1410
Epoch 190/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3073 - rmse_dl: 1.1355 - val_loss: 1.3217 - val_rmse_dl: 1.1412
Epoch 191/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3050 - rmse_dl: 1.1343 - val_loss: 1.3228 - val_rmse_dl: 1.1417
Epoch 192/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3088 - rmse_dl: 1.1359 - val_loss: 1.3364 - val_rmse_dl: 1.1475
Epoch 193/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3088 - rmse_dl: 1.1364 - val_loss: 1.3337 - val_rmse_dl: 1.1461
Epoch 194/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3109 - rmse_dl: 1.1372 - val_loss: 1.3143 - val_rmse_dl: 1.1388
Epoch 195/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3098 - rmse_dl: 1.1375 - val_loss: 1.3147 - val_rmse_dl: 1.1394
Epoch 196/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3095 - rmse_dl: 1.1365 - val_loss: 1.3185 - val_rmse_dl: 1.1399
Epoch 197/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3087 - rmse_dl: 1.1358 - val_loss: 1.3634 - val_rmse_dl: 1.1582
Epoch 198/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3120 - rmse_dl: 1.1369 - val_loss: 1.3182 - val_rmse_dl: 1.1398
Epoch 199/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3109 - rmse_dl: 1.1371 - val_loss: 1.3336 - val_rmse_dl: 1.1458
Epoch 200/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3151 - rmse_dl: 1.1396 - val_loss: 1.3172 - val_rmse_dl: 1.1393
Epoch 201/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3151 - rmse_dl: 1.1379 - val_loss: 1.3198 - val_rmse_dl: 1.1409
Epoch 202/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3118 - rmse_dl: 1.1379 - val_loss: 1.3336 - val_rmse_dl: 1.1461
Epoch 203/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3118 - rmse_dl: 1.1371 - val_loss: 1.3267 - val_rmse_dl: 1.1433
Epoch 204/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3110 - rmse_dl: 1.1375 - val_loss: 1.3290 - val_rmse_dl: 1.1442
Epoch 205/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3086 - rmse_dl: 1.1367 - val_loss: 1.3209 - val_rmse_dl: 1.1407
Epoch 206/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3105 - rmse_dl: 1.1363 - val_loss: 1.3213 - val_rmse_dl: 1.1414
Epoch 207/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3098 - rmse_dl: 1.1365 - val_loss: 1.3285 - val_rmse_dl: 1.1443
Epoch 208/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3096 - rmse_dl: 1.1361 - val_loss: 1.3165 - val_rmse_dl: 1.1396
Epoch 209/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3105 - rmse_dl: 1.1374 - val_loss: 1.3305 - val_rmse_dl: 1.1452
Epoch 210/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3120 - rmse_dl: 1.1373 - val_loss: 1.3283 - val_rmse_dl: 1.1443
Epoch 211/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3159 - rmse_dl: 1.1394 - val_loss: 1.3488 - val_rmse_dl: 1.1522
Epoch 212/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3097 - rmse_dl: 1.1368 - val_loss: 1.3185 - val_rmse_dl: 1.1402
Epoch 213/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3064 - rmse_dl: 1.1356 - val_loss: 1.3267 - val_rmse_dl: 1.1436
Epoch 214/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3093 - rmse_dl: 1.1357 - val_loss: 1.3227 - val_rmse_dl: 1.1420
Epoch 215/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3105 - rmse_dl: 1.1356 - val_loss: 1.3247 - val_rmse_dl: 1.1424
Epoch 216/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3102 - rmse_dl: 1.1366 - val_loss: 1.3341 - val_rmse_dl: 1.1464
Epoch 217/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3108 - rmse_dl: 1.1371 - val_loss: 1.3215 - val_rmse_dl: 1.1413
Epoch 218/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3098 - rmse_dl: 1.1370 - val_loss: 1.3214 - val_rmse_dl: 1.1412
Epoch 219/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3089 - rmse_dl: 1.1366 - val_loss: 1.3288 - val_rmse_dl: 1.1440
Epoch 220/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3090 - rmse_dl: 1.1369 - val_loss: 1.3241 - val_rmse_dl: 1.1428
Epoch 221/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3142 - rmse_dl: 1.1383 - val_loss: 1.3227 - val_rmse_dl: 1.1418
Epoch 222/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3063 - rmse_dl: 1.1359 - val_loss: 1.3260 - val_rmse_dl: 1.1431
Epoch 223/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3097 - rmse_dl: 1.1372 - val_loss: 1.3146 - val_rmse_dl: 1.1391
Epoch 224/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3075 - rmse_dl: 1.1364 - val_loss: 1.3163 - val_rmse_dl: 1.1398
Epoch 225/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3073 - rmse_dl: 1.1351 - val_loss: 1.3254 - val_rmse_dl: 1.1433
Epoch 226/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3117 - rmse_dl: 1.1383 - val_loss: 1.3377 - val_rmse_dl: 1.1480
Epoch 227/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3122 - rmse_dl: 1.1384 - val_loss: 1.3314 - val_rmse_dl: 1.1447
Epoch 228/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3087 - rmse_dl: 1.1368 - val_loss: 1.3266 - val_rmse_dl: 1.1442
Epoch 229/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3072 - rmse_dl: 1.1357 - val_loss: 1.3197 - val_rmse_dl: 1.1412
Epoch 230/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3082 - rmse_dl: 1.1361 - val_loss: 1.3146 - val_rmse_dl: 1.1392
Epoch 231/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3100 - rmse_dl: 1.1363 - val_loss: 1.3170 - val_rmse_dl: 1.1397
Epoch 232/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3082 - rmse_dl: 1.1344 - val_loss: 1.3172 - val_rmse_dl: 1.1397
Epoch 233/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3091 - rmse_dl: 1.1359 - val_loss: 1.3242 - val_rmse_dl: 1.1425
Epoch 234/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3120 - rmse_dl: 1.1378 - val_loss: 1.3238 - val_rmse_dl: 1.1424
Epoch 235/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3074 - rmse_dl: 1.1355 - val_loss: 1.3265 - val_rmse_dl: 1.1436
Epoch 236/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3090 - rmse_dl: 1.1369 - val_loss: 1.3254 - val_rmse_dl: 1.1431
Epoch 237/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3060 - rmse_dl: 1.1350 - val_loss: 1.3211 - val_rmse_dl: 1.1414
Epoch 238/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3101 - rmse_dl: 1.1369 - val_loss: 1.3121 - val_rmse_dl: 1.1376
Epoch 239/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3100 - rmse_dl: 1.1370 - val_loss: 1.3326 - val_rmse_dl: 1.1459
Epoch 240/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3115 - rmse_dl: 1.1367 - val_loss: 1.3217 - val_rmse_dl: 1.1416
Epoch 241/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3118 - rmse_dl: 1.1385 - val_loss: 1.3199 - val_rmse_dl: 1.1411
Epoch 242/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3108 - rmse_dl: 1.1364 - val_loss: 1.3331 - val_rmse_dl: 1.1461
Epoch 243/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3096 - rmse_dl: 1.1376 - val_loss: 1.3156 - val_rmse_dl: 1.1400
Epoch 244/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3091 - rmse_dl: 1.1362 - val_loss: 1.3251 - val_rmse_dl: 1.1436
Epoch 245/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3092 - rmse_dl: 1.1371 - val_loss: 1.3292 - val_rmse_dl: 1.1446
Epoch 246/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3080 - rmse_dl: 1.1351 - val_loss: 1.3182 - val_rmse_dl: 1.1406
Epoch 247/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3095 - rmse_dl: 1.1361 - val_loss: 1.3267 - val_rmse_dl: 1.1431
Epoch 248/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3064 - rmse_dl: 1.1355 - val_loss: 1.3249 - val_rmse_dl: 1.1431
Epoch 249/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3119 - rmse_dl: 1.1374 - val_loss: 1.3263 - val_rmse_dl: 1.1434
Epoch 250/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3107 - rmse_dl: 1.1363 - val_loss: 1.3387 - val_rmse_dl: 1.1485
Epoch 251/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3068 - rmse_dl: 1.1361 - val_loss: 1.3377 - val_rmse_dl: 1.1477
Epoch 252/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3140 - rmse_dl: 1.1385 - val_loss: 1.3336 - val_rmse_dl: 1.1467
Epoch 253/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3051 - rmse_dl: 1.1352 - val_loss: 1.3278 - val_rmse_dl: 1.1437
Epoch 254/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3089 - rmse_dl: 1.1362 - val_loss: 1.3406 - val_rmse_dl: 1.1491
Epoch 255/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3074 - rmse_dl: 1.1352 - val_loss: 1.3180 - val_rmse_dl: 1.1399
Epoch 256/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3096 - rmse_dl: 1.1369 - val_loss: 1.3179 - val_rmse_dl: 1.1400
Epoch 257/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3100 - rmse_dl: 1.1363 - val_loss: 1.3147 - val_rmse_dl: 1.1393
Epoch 258/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3058 - rmse_dl: 1.1356 - val_loss: 1.3353 - val_rmse_dl: 1.1468
Epoch 259/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3109 - rmse_dl: 1.1369 - val_loss: 1.3269 - val_rmse_dl: 1.1440
Epoch 260/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3093 - rmse_dl: 1.1366 - val_loss: 1.3239 - val_rmse_dl: 1.1421
Epoch 261/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3070 - rmse_dl: 1.1354 - val_loss: 1.3164 - val_rmse_dl: 1.1397
Epoch 262/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3088 - rmse_dl: 1.1363 - val_loss: 1.3333 - val_rmse_dl: 1.1460
Epoch 263/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3097 - rmse_dl: 1.1358 - val_loss: 1.3366 - val_rmse_dl: 1.1474
Epoch 264/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3089 - rmse_dl: 1.1373 - val_loss: 1.3253 - val_rmse_dl: 1.1424
Epoch 265/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3099 - rmse_dl: 1.1368 - val_loss: 1.3205 - val_rmse_dl: 1.1411
Epoch 266/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3061 - rmse_dl: 1.1342 - val_loss: 1.3371 - val_rmse_dl: 1.1474
Epoch 267/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3080 - rmse_dl: 1.1355 - val_loss: 1.3171 - val_rmse_dl: 1.1395
Epoch 268/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3110 - rmse_dl: 1.1376 - val_loss: 1.3222 - val_rmse_dl: 1.1419
Epoch 269/550
338/338 [==============================] - ETA: 0s - loss: 1.3071 - rmse_dl: 1.136 - 1s 4ms/step - loss: 1.3060 - rmse_dl: 1.1358 - val_loss: 1.3225 - val_rmse_dl: 1.1424
Epoch 270/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3104 - rmse_dl: 1.1367 - val_loss: 1.3192 - val_rmse_dl: 1.1407
Epoch 271/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3112 - rmse_dl: 1.1369 - val_loss: 1.3348 - val_rmse_dl: 1.1465
Epoch 272/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3097 - rmse_dl: 1.1368 - val_loss: 1.3166 - val_rmse_dl: 1.1400
Epoch 273/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3109 - rmse_dl: 1.1368 - val_loss: 1.3298 - val_rmse_dl: 1.1453
Epoch 274/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3078 - rmse_dl: 1.1369 - val_loss: 1.3192 - val_rmse_dl: 1.1404
Epoch 275/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3127 - rmse_dl: 1.1382 - val_loss: 1.3206 - val_rmse_dl: 1.1413
Epoch 276/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3107 - rmse_dl: 1.1386 - val_loss: 1.3294 - val_rmse_dl: 1.1446
Epoch 277/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3077 - rmse_dl: 1.1369 - val_loss: 1.3167 - val_rmse_dl: 1.1399
Epoch 278/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3080 - rmse_dl: 1.1363 - val_loss: 1.3243 - val_rmse_dl: 1.1425
Epoch 279/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3110 - rmse_dl: 1.1377 - val_loss: 1.3155 - val_rmse_dl: 1.1396
Epoch 280/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3075 - rmse_dl: 1.1356 - val_loss: 1.3230 - val_rmse_dl: 1.1419
Epoch 281/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3059 - rmse_dl: 1.1349 - val_loss: 1.3219 - val_rmse_dl: 1.1420
Epoch 282/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3112 - rmse_dl: 1.1378 - val_loss: 1.3316 - val_rmse_dl: 1.1455
Epoch 283/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3080 - rmse_dl: 1.1361 - val_loss: 1.3509 - val_rmse_dl: 1.1529
Epoch 284/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3077 - rmse_dl: 1.1352 - val_loss: 1.3418 - val_rmse_dl: 1.1495
Epoch 285/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3076 - rmse_dl: 1.1358 - val_loss: 1.3098 - val_rmse_dl: 1.1372
Epoch 286/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3130 - rmse_dl: 1.1384 - val_loss: 1.3272 - val_rmse_dl: 1.1438
Epoch 287/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3088 - rmse_dl: 1.1370 - val_loss: 1.3200 - val_rmse_dl: 1.1409
Epoch 288/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3055 - rmse_dl: 1.1350 - val_loss: 1.3352 - val_rmse_dl: 1.1474
Epoch 289/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3072 - rmse_dl: 1.1349 - val_loss: 1.3341 - val_rmse_dl: 1.1470
Epoch 290/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3094 - rmse_dl: 1.1363 - val_loss: 1.3173 - val_rmse_dl: 1.1402
Epoch 291/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3082 - rmse_dl: 1.1361 - val_loss: 1.3223 - val_rmse_dl: 1.1420
Epoch 292/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3098 - rmse_dl: 1.1377 - val_loss: 1.3197 - val_rmse_dl: 1.1405
Epoch 293/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3107 - rmse_dl: 1.1379 - val_loss: 1.3196 - val_rmse_dl: 1.1410
Epoch 294/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3085 - rmse_dl: 1.1373 - val_loss: 1.3159 - val_rmse_dl: 1.1395
Epoch 295/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3074 - rmse_dl: 1.1362 - val_loss: 1.3208 - val_rmse_dl: 1.1415
Epoch 296/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3084 - rmse_dl: 1.1367 - val_loss: 1.3239 - val_rmse_dl: 1.1424
Epoch 297/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3112 - rmse_dl: 1.1373 - val_loss: 1.3180 - val_rmse_dl: 1.1403
Epoch 298/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3081 - rmse_dl: 1.1365 - val_loss: 1.3328 - val_rmse_dl: 1.1459
Epoch 299/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3073 - rmse_dl: 1.1364 - val_loss: 1.3186 - val_rmse_dl: 1.1405
Epoch 300/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3068 - rmse_dl: 1.1357 - val_loss: 1.3183 - val_rmse_dl: 1.1405
Epoch 301/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3078 - rmse_dl: 1.1360 - val_loss: 1.3186 - val_rmse_dl: 1.1410
Epoch 302/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3099 - rmse_dl: 1.1374 - val_loss: 1.3248 - val_rmse_dl: 1.1430
Epoch 303/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3096 - rmse_dl: 1.1369 - val_loss: 1.3282 - val_rmse_dl: 1.1442
Epoch 304/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3086 - rmse_dl: 1.1364 - val_loss: 1.3127 - val_rmse_dl: 1.1383
Epoch 305/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3061 - rmse_dl: 1.1355 - val_loss: 1.3262 - val_rmse_dl: 1.1438
Epoch 306/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3048 - rmse_dl: 1.1353 - val_loss: 1.3166 - val_rmse_dl: 1.1399
Epoch 307/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3074 - rmse_dl: 1.1361 - val_loss: 1.3400 - val_rmse_dl: 1.1487
Epoch 308/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3096 - rmse_dl: 1.1364 - val_loss: 1.3192 - val_rmse_dl: 1.1408
Epoch 309/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3082 - rmse_dl: 1.1370 - val_loss: 1.3157 - val_rmse_dl: 1.1397
Epoch 310/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3121 - rmse_dl: 1.1385 - val_loss: 1.3182 - val_rmse_dl: 1.1405
Epoch 311/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3068 - rmse_dl: 1.1355 - val_loss: 1.3498 - val_rmse_dl: 1.1533
Epoch 312/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3084 - rmse_dl: 1.1351 - val_loss: 1.3308 - val_rmse_dl: 1.1452
Epoch 313/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3091 - rmse_dl: 1.1364 - val_loss: 1.3484 - val_rmse_dl: 1.1526
Epoch 314/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3080 - rmse_dl: 1.1362 - val_loss: 1.3252 - val_rmse_dl: 1.1430
Epoch 315/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3056 - rmse_dl: 1.1348 - val_loss: 1.3306 - val_rmse_dl: 1.1454
Epoch 316/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3057 - rmse_dl: 1.1360 - val_loss: 1.3121 - val_rmse_dl: 1.1380
Epoch 317/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3087 - rmse_dl: 1.1360 - val_loss: 1.3175 - val_rmse_dl: 1.1404
Epoch 318/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3101 - rmse_dl: 1.1383 - val_loss: 1.3148 - val_rmse_dl: 1.1393
Epoch 319/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3085 - rmse_dl: 1.1363 - val_loss: 1.3347 - val_rmse_dl: 1.1471
Epoch 320/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3090 - rmse_dl: 1.1367 - val_loss: 1.3176 - val_rmse_dl: 1.1404
Epoch 321/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3060 - rmse_dl: 1.1339 - val_loss: 1.3247 - val_rmse_dl: 1.1434
Epoch 322/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3038 - rmse_dl: 1.1336 - val_loss: 1.3297 - val_rmse_dl: 1.1448
Epoch 323/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3069 - rmse_dl: 1.1348 - val_loss: 1.3347 - val_rmse_dl: 1.1472
Epoch 324/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3103 - rmse_dl: 1.1374 - val_loss: 1.3444 - val_rmse_dl: 1.1506
Epoch 325/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3075 - rmse_dl: 1.1363 - val_loss: 1.3300 - val_rmse_dl: 1.1449
Epoch 326/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3087 - rmse_dl: 1.1362 - val_loss: 1.3301 - val_rmse_dl: 1.1446
Epoch 327/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3073 - rmse_dl: 1.1364 - val_loss: 1.3153 - val_rmse_dl: 1.1401
Epoch 328/550
338/338 [==============================] - 2s 5ms/step - loss: 1.3076 - rmse_dl: 1.1366 - val_loss: 1.3184 - val_rmse_dl: 1.1405
Epoch 329/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3090 - rmse_dl: 1.1365 - val_loss: 1.3179 - val_rmse_dl: 1.1406
Epoch 330/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3046 - rmse_dl: 1.1341 - val_loss: 1.3364 - val_rmse_dl: 1.1477
Epoch 331/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3101 - rmse_dl: 1.1374 - val_loss: 1.3197 - val_rmse_dl: 1.1409
Epoch 332/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3076 - rmse_dl: 1.1365 - val_loss: 1.3180 - val_rmse_dl: 1.1405
Epoch 333/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3090 - rmse_dl: 1.1364 - val_loss: 1.3215 - val_rmse_dl: 1.1418
Epoch 334/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3048 - rmse_dl: 1.1356 - val_loss: 1.3369 - val_rmse_dl: 1.1476
Epoch 335/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3056 - rmse_dl: 1.1361 - val_loss: 1.3161 - val_rmse_dl: 1.1397
Epoch 336/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3084 - rmse_dl: 1.1360 - val_loss: 1.3247 - val_rmse_dl: 1.1430
Epoch 337/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3085 - rmse_dl: 1.1365 - val_loss: 1.3181 - val_rmse_dl: 1.1405
Epoch 338/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3080 - rmse_dl: 1.1355 - val_loss: 1.3203 - val_rmse_dl: 1.1413
Epoch 339/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3093 - rmse_dl: 1.1373 - val_loss: 1.3174 - val_rmse_dl: 1.1406
Epoch 340/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3052 - rmse_dl: 1.1350 - val_loss: 1.3345 - val_rmse_dl: 1.1472
Epoch 341/550
338/338 [==============================] - 2s 5ms/step - loss: 1.3099 - rmse_dl: 1.1365 - val_loss: 1.3161 - val_rmse_dl: 1.1397
Epoch 342/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3045 - rmse_dl: 1.1348 - val_loss: 1.3394 - val_rmse_dl: 1.1488
Epoch 343/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3096 - rmse_dl: 1.1368 - val_loss: 1.3245 - val_rmse_dl: 1.1426
Epoch 344/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3062 - rmse_dl: 1.1358 - val_loss: 1.3210 - val_rmse_dl: 1.1415
Epoch 345/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3136 - rmse_dl: 1.1392 - val_loss: 1.3228 - val_rmse_dl: 1.1419
Epoch 346/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3090 - rmse_dl: 1.1364 - val_loss: 1.3171 - val_rmse_dl: 1.1401
Epoch 347/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3109 - rmse_dl: 1.1376 - val_loss: 1.3187 - val_rmse_dl: 1.1400
Epoch 348/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3089 - rmse_dl: 1.1363 - val_loss: 1.3296 - val_rmse_dl: 1.1450
Epoch 349/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3047 - rmse_dl: 1.1349 - val_loss: 1.3232 - val_rmse_dl: 1.1422
Epoch 350/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3094 - rmse_dl: 1.1372 - val_loss: 1.3476 - val_rmse_dl: 1.1521
Epoch 351/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3096 - rmse_dl: 1.1362 - val_loss: 1.3480 - val_rmse_dl: 1.1520
Epoch 352/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3051 - rmse_dl: 1.1353 - val_loss: 1.3353 - val_rmse_dl: 1.1470
Epoch 353/550
338/338 [==============================] - ETA: 0s - loss: 1.3093 - rmse_dl: 1.136 - 1s 4ms/step - loss: 1.3109 - rmse_dl: 1.1380 - val_loss: 1.3221 - val_rmse_dl: 1.1420
Epoch 354/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3081 - rmse_dl: 1.1367 - val_loss: 1.3170 - val_rmse_dl: 1.1402
Epoch 355/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3090 - rmse_dl: 1.1373 - val_loss: 1.3383 - val_rmse_dl: 1.1485
Epoch 356/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3080 - rmse_dl: 1.1363 - val_loss: 1.3240 - val_rmse_dl: 1.1429
Epoch 357/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3095 - rmse_dl: 1.1365 - val_loss: 1.3130 - val_rmse_dl: 1.1388
Epoch 358/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3068 - rmse_dl: 1.1350 - val_loss: 1.3283 - val_rmse_dl: 1.1446
Epoch 359/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3089 - rmse_dl: 1.1371 - val_loss: 1.3175 - val_rmse_dl: 1.1403
Epoch 360/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3098 - rmse_dl: 1.1377 - val_loss: 1.3127 - val_rmse_dl: 1.1387
Epoch 361/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3044 - rmse_dl: 1.1349 - val_loss: 1.3181 - val_rmse_dl: 1.1411
Epoch 362/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3135 - rmse_dl: 1.1397 - val_loss: 1.3141 - val_rmse_dl: 1.1394
Epoch 363/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3074 - rmse_dl: 1.1369 - val_loss: 1.3227 - val_rmse_dl: 1.1426
Epoch 364/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3096 - rmse_dl: 1.1372 - val_loss: 1.3265 - val_rmse_dl: 1.1437
Epoch 365/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3067 - rmse_dl: 1.1368 - val_loss: 1.3173 - val_rmse_dl: 1.1399
Epoch 366/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3101 - rmse_dl: 1.1371 - val_loss: 1.3287 - val_rmse_dl: 1.1447
Epoch 367/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3059 - rmse_dl: 1.1363 - val_loss: 1.3159 - val_rmse_dl: 1.1401
Epoch 368/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3096 - rmse_dl: 1.1381 - val_loss: 1.3200 - val_rmse_dl: 1.1411
Epoch 369/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3089 - rmse_dl: 1.1370 - val_loss: 1.3328 - val_rmse_dl: 1.1464
Epoch 370/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3124 - rmse_dl: 1.1387 - val_loss: 1.3259 - val_rmse_dl: 1.1431
Epoch 371/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3068 - rmse_dl: 1.1362 - val_loss: 1.3307 - val_rmse_dl: 1.1458
Epoch 372/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3096 - rmse_dl: 1.1375 - val_loss: 1.3120 - val_rmse_dl: 1.1384
Epoch 373/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3080 - rmse_dl: 1.1365 - val_loss: 1.3196 - val_rmse_dl: 1.1410
Epoch 374/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3075 - rmse_dl: 1.1361 - val_loss: 1.3397 - val_rmse_dl: 1.1491
Epoch 375/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3060 - rmse_dl: 1.1356 - val_loss: 1.3246 - val_rmse_dl: 1.1429
Epoch 376/550
338/338 [==============================] - ETA: 0s - loss: 1.3072 - rmse_dl: 1.136 - 1s 4ms/step - loss: 1.3063 - rmse_dl: 1.1358 - val_loss: 1.3133 - val_rmse_dl: 1.1388
Epoch 377/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3050 - rmse_dl: 1.1359 - val_loss: 1.3214 - val_rmse_dl: 1.1415
Epoch 378/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3057 - rmse_dl: 1.1338 - val_loss: 1.3180 - val_rmse_dl: 1.1405
Epoch 379/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3054 - rmse_dl: 1.1352 - val_loss: 1.3274 - val_rmse_dl: 1.1436
Epoch 380/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3047 - rmse_dl: 1.1349 - val_loss: 1.3346 - val_rmse_dl: 1.1475
Epoch 381/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3076 - rmse_dl: 1.1360 - val_loss: 1.3191 - val_rmse_dl: 1.1410
Epoch 382/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3091 - rmse_dl: 1.1375 - val_loss: 1.3141 - val_rmse_dl: 1.1390
Epoch 383/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3069 - rmse_dl: 1.1365 - val_loss: 1.3210 - val_rmse_dl: 1.1421
Epoch 384/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3071 - rmse_dl: 1.1366 - val_loss: 1.3282 - val_rmse_dl: 1.1448
Epoch 385/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3114 - rmse_dl: 1.1388 - val_loss: 1.3146 - val_rmse_dl: 1.1391
Epoch 386/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3091 - rmse_dl: 1.1359 - val_loss: 1.3192 - val_rmse_dl: 1.1408
Epoch 387/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3117 - rmse_dl: 1.1384 - val_loss: 1.3377 - val_rmse_dl: 1.1481
Epoch 388/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3082 - rmse_dl: 1.1362 - val_loss: 1.3290 - val_rmse_dl: 1.1448
Epoch 389/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3120 - rmse_dl: 1.1391 - val_loss: 1.3130 - val_rmse_dl: 1.1393
Epoch 390/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3075 - rmse_dl: 1.1377 - val_loss: 1.3289 - val_rmse_dl: 1.1445
Epoch 391/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3068 - rmse_dl: 1.1362 - val_loss: 1.3269 - val_rmse_dl: 1.1443
Epoch 392/550
338/338 [==============================] - ETA: 0s - loss: 1.3152 - rmse_dl: 1.139 - 1s 4ms/step - loss: 1.3152 - rmse_dl: 1.1397 - val_loss: 1.3297 - val_rmse_dl: 1.1451
Epoch 393/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3099 - rmse_dl: 1.1379 - val_loss: 1.3242 - val_rmse_dl: 1.1429
Epoch 394/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3063 - rmse_dl: 1.1355 - val_loss: 1.3233 - val_rmse_dl: 1.1428
Epoch 395/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3100 - rmse_dl: 1.1383 - val_loss: 1.3136 - val_rmse_dl: 1.1388
Epoch 396/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3077 - rmse_dl: 1.1361 - val_loss: 1.3230 - val_rmse_dl: 1.1427
Epoch 397/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3057 - rmse_dl: 1.1356 - val_loss: 1.3291 - val_rmse_dl: 1.1445
Epoch 398/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3087 - rmse_dl: 1.1372 - val_loss: 1.3192 - val_rmse_dl: 1.1415
Epoch 399/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3112 - rmse_dl: 1.1373 - val_loss: 1.3247 - val_rmse_dl: 1.1431
Epoch 400/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3066 - rmse_dl: 1.1353 - val_loss: 1.3267 - val_rmse_dl: 1.1441
Epoch 401/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3067 - rmse_dl: 1.1367 - val_loss: 1.3235 - val_rmse_dl: 1.1425
Epoch 402/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3115 - rmse_dl: 1.1376 - val_loss: 1.3330 - val_rmse_dl: 1.1462
Epoch 403/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3082 - rmse_dl: 1.1370 - val_loss: 1.3352 - val_rmse_dl: 1.1473
Epoch 404/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3100 - rmse_dl: 1.1380 - val_loss: 1.3355 - val_rmse_dl: 1.1475
Epoch 405/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3120 - rmse_dl: 1.1389 - val_loss: 1.3313 - val_rmse_dl: 1.1460
Epoch 406/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3052 - rmse_dl: 1.1343 - val_loss: 1.3185 - val_rmse_dl: 1.1410
Epoch 407/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3109 - rmse_dl: 1.1373 - val_loss: 1.3171 - val_rmse_dl: 1.1406
Epoch 408/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3069 - rmse_dl: 1.1361 - val_loss: 1.3173 - val_rmse_dl: 1.1406
Epoch 409/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3083 - rmse_dl: 1.1374 - val_loss: 1.3161 - val_rmse_dl: 1.1401
Epoch 410/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3109 - rmse_dl: 1.1384 - val_loss: 1.3121 - val_rmse_dl: 1.1382
Epoch 411/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3076 - rmse_dl: 1.1361 - val_loss: 1.3345 - val_rmse_dl: 1.1468
Epoch 412/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3088 - rmse_dl: 1.1357 - val_loss: 1.3177 - val_rmse_dl: 1.1399
Epoch 413/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3092 - rmse_dl: 1.1370 - val_loss: 1.3106 - val_rmse_dl: 1.1375
Epoch 414/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3092 - rmse_dl: 1.1361 - val_loss: 1.3220 - val_rmse_dl: 1.1414
Epoch 415/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3078 - rmse_dl: 1.1350 - val_loss: 1.3220 - val_rmse_dl: 1.1423
Epoch 416/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3039 - rmse_dl: 1.1347 - val_loss: 1.3474 - val_rmse_dl: 1.1521
Epoch 417/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3106 - rmse_dl: 1.1378 - val_loss: 1.3250 - val_rmse_dl: 1.1434
Epoch 418/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3069 - rmse_dl: 1.1352 - val_loss: 1.3166 - val_rmse_dl: 1.1403
Epoch 419/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3072 - rmse_dl: 1.1356 - val_loss: 1.3351 - val_rmse_dl: 1.1467
Epoch 420/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3097 - rmse_dl: 1.1368 - val_loss: 1.3263 - val_rmse_dl: 1.1443
Epoch 421/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3066 - rmse_dl: 1.1356 - val_loss: 1.3420 - val_rmse_dl: 1.1501
Epoch 422/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3098 - rmse_dl: 1.1378 - val_loss: 1.3209 - val_rmse_dl: 1.1417
Epoch 423/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3042 - rmse_dl: 1.1356 - val_loss: 1.3352 - val_rmse_dl: 1.1478
Epoch 424/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3101 - rmse_dl: 1.1371 - val_loss: 1.3429 - val_rmse_dl: 1.1508
Epoch 425/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3097 - rmse_dl: 1.1371 - val_loss: 1.3188 - val_rmse_dl: 1.1407
Epoch 426/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3048 - rmse_dl: 1.1353 - val_loss: 1.3228 - val_rmse_dl: 1.1424
Epoch 427/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3051 - rmse_dl: 1.1346 - val_loss: 1.3394 - val_rmse_dl: 1.1489
Epoch 428/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3076 - rmse_dl: 1.1365 - val_loss: 1.3267 - val_rmse_dl: 1.1438
Epoch 429/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3096 - rmse_dl: 1.1366 - val_loss: 1.3251 - val_rmse_dl: 1.1433
Epoch 430/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3078 - rmse_dl: 1.1363 - val_loss: 1.3260 - val_rmse_dl: 1.1434
Epoch 431/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3094 - rmse_dl: 1.1378 - val_loss: 1.3152 - val_rmse_dl: 1.1396
Epoch 432/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3068 - rmse_dl: 1.1357 - val_loss: 1.3437 - val_rmse_dl: 1.1507
Epoch 433/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3096 - rmse_dl: 1.1364 - val_loss: 1.3193 - val_rmse_dl: 1.1413
Epoch 434/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3116 - rmse_dl: 1.1389 - val_loss: 1.3233 - val_rmse_dl: 1.1432
Epoch 435/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3050 - rmse_dl: 1.1356 - val_loss: 1.3351 - val_rmse_dl: 1.1476
Epoch 436/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3127 - rmse_dl: 1.1384 - val_loss: 1.3170 - val_rmse_dl: 1.1398
Epoch 437/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3093 - rmse_dl: 1.1363 - val_loss: 1.3188 - val_rmse_dl: 1.1412
Epoch 438/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3052 - rmse_dl: 1.1352 - val_loss: 1.3153 - val_rmse_dl: 1.1394
Epoch 439/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3108 - rmse_dl: 1.1375 - val_loss: 1.3219 - val_rmse_dl: 1.1422
Epoch 440/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3048 - rmse_dl: 1.1347 - val_loss: 1.3177 - val_rmse_dl: 1.1405
Epoch 441/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3122 - rmse_dl: 1.1380 - val_loss: 1.3307 - val_rmse_dl: 1.1457
Epoch 442/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3056 - rmse_dl: 1.1351 - val_loss: 1.3140 - val_rmse_dl: 1.1392
Epoch 443/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3114 - rmse_dl: 1.1381 - val_loss: 1.3227 - val_rmse_dl: 1.1421
Epoch 444/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3089 - rmse_dl: 1.1371 - val_loss: 1.3349 - val_rmse_dl: 1.1469
Epoch 445/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3072 - rmse_dl: 1.1364 - val_loss: 1.3244 - val_rmse_dl: 1.1430
Epoch 446/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3031 - rmse_dl: 1.1350 - val_loss: 1.3330 - val_rmse_dl: 1.1460
Epoch 447/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3107 - rmse_dl: 1.1374 - val_loss: 1.3315 - val_rmse_dl: 1.1457
Epoch 448/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3065 - rmse_dl: 1.1358 - val_loss: 1.3294 - val_rmse_dl: 1.1451
Epoch 449/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3086 - rmse_dl: 1.1377 - val_loss: 1.3288 - val_rmse_dl: 1.1450
Epoch 450/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3105 - rmse_dl: 1.1377 - val_loss: 1.3199 - val_rmse_dl: 1.1413
Epoch 451/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3066 - rmse_dl: 1.1362 - val_loss: 1.3166 - val_rmse_dl: 1.1396
Epoch 452/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3070 - rmse_dl: 1.1369 - val_loss: 1.3190 - val_rmse_dl: 1.1417
Epoch 453/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3083 - rmse_dl: 1.1369 - val_loss: 1.3229 - val_rmse_dl: 1.1420
Epoch 454/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3060 - rmse_dl: 1.1357 - val_loss: 1.3216 - val_rmse_dl: 1.1421
Epoch 455/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3099 - rmse_dl: 1.1375 - val_loss: 1.3278 - val_rmse_dl: 1.1445
Epoch 456/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3086 - rmse_dl: 1.1362 - val_loss: 1.3121 - val_rmse_dl: 1.1382
Epoch 457/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3081 - rmse_dl: 1.1365 - val_loss: 1.3238 - val_rmse_dl: 1.1428
Epoch 458/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3069 - rmse_dl: 1.1358 - val_loss: 1.3328 - val_rmse_dl: 1.1462
Epoch 459/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3068 - rmse_dl: 1.1357 - val_loss: 1.3182 - val_rmse_dl: 1.1407
Epoch 460/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3100 - rmse_dl: 1.1365 - val_loss: 1.3214 - val_rmse_dl: 1.1420
Epoch 461/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3049 - rmse_dl: 1.1356 - val_loss: 1.3149 - val_rmse_dl: 1.1396
Epoch 462/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3091 - rmse_dl: 1.1369 - val_loss: 1.3291 - val_rmse_dl: 1.1453
Epoch 463/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3065 - rmse_dl: 1.1355 - val_loss: 1.3289 - val_rmse_dl: 1.1451
Epoch 464/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3102 - rmse_dl: 1.1382 - val_loss: 1.3214 - val_rmse_dl: 1.1420
Epoch 465/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3066 - rmse_dl: 1.1360 - val_loss: 1.3255 - val_rmse_dl: 1.1434
Epoch 466/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3076 - rmse_dl: 1.1363 - val_loss: 1.3148 - val_rmse_dl: 1.1394
Epoch 467/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3096 - rmse_dl: 1.1372 - val_loss: 1.3222 - val_rmse_dl: 1.1424
Epoch 468/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3106 - rmse_dl: 1.1373 - val_loss: 1.3248 - val_rmse_dl: 1.1437
Epoch 469/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3063 - rmse_dl: 1.1354 - val_loss: 1.3328 - val_rmse_dl: 1.1469
Epoch 470/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3113 - rmse_dl: 1.1384 - val_loss: 1.3130 - val_rmse_dl: 1.1392
Epoch 471/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3094 - rmse_dl: 1.1372 - val_loss: 1.3186 - val_rmse_dl: 1.1412
Epoch 472/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3048 - rmse_dl: 1.1354 - val_loss: 1.3248 - val_rmse_dl: 1.1434
Epoch 473/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3089 - rmse_dl: 1.1375 - val_loss: 1.3296 - val_rmse_dl: 1.1454
Epoch 474/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3093 - rmse_dl: 1.1376 - val_loss: 1.3502 - val_rmse_dl: 1.1537
Epoch 475/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3065 - rmse_dl: 1.1357 - val_loss: 1.3252 - val_rmse_dl: 1.1439
Epoch 476/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3092 - rmse_dl: 1.1375 - val_loss: 1.3192 - val_rmse_dl: 1.1412
Epoch 477/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3081 - rmse_dl: 1.1374 - val_loss: 1.3154 - val_rmse_dl: 1.1396
Epoch 478/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3099 - rmse_dl: 1.1369 - val_loss: 1.3159 - val_rmse_dl: 1.1397
Epoch 479/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3076 - rmse_dl: 1.1359 - val_loss: 1.3301 - val_rmse_dl: 1.1453
Epoch 480/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3068 - rmse_dl: 1.1366 - val_loss: 1.3168 - val_rmse_dl: 1.1401
Epoch 481/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3080 - rmse_dl: 1.1361 - val_loss: 1.3642 - val_rmse_dl: 1.1590
Epoch 482/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3066 - rmse_dl: 1.1364 - val_loss: 1.3205 - val_rmse_dl: 1.1414
Epoch 483/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3051 - rmse_dl: 1.1355 - val_loss: 1.3221 - val_rmse_dl: 1.1421
Epoch 484/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3077 - rmse_dl: 1.1367 - val_loss: 1.3306 - val_rmse_dl: 1.1454
Epoch 485/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3042 - rmse_dl: 1.1354 - val_loss: 1.3232 - val_rmse_dl: 1.1430
Epoch 486/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3119 - rmse_dl: 1.1381 - val_loss: 1.3129 - val_rmse_dl: 1.1387
Epoch 487/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3061 - rmse_dl: 1.1364 - val_loss: 1.3220 - val_rmse_dl: 1.1422
Epoch 488/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3107 - rmse_dl: 1.1371 - val_loss: 1.3174 - val_rmse_dl: 1.1401
Epoch 489/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3052 - rmse_dl: 1.1350 - val_loss: 1.3355 - val_rmse_dl: 1.1471
Epoch 490/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3058 - rmse_dl: 1.1356 - val_loss: 1.3166 - val_rmse_dl: 1.1404
Epoch 491/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3066 - rmse_dl: 1.1364 - val_loss: 1.3184 - val_rmse_dl: 1.1409
Epoch 492/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3104 - rmse_dl: 1.1381 - val_loss: 1.3234 - val_rmse_dl: 1.1430
Epoch 493/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3105 - rmse_dl: 1.1375 - val_loss: 1.3179 - val_rmse_dl: 1.1408
Epoch 494/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3074 - rmse_dl: 1.1357 - val_loss: 1.3198 - val_rmse_dl: 1.1413
Epoch 495/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3062 - rmse_dl: 1.1358 - val_loss: 1.3243 - val_rmse_dl: 1.1432
Epoch 496/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3113 - rmse_dl: 1.1385 - val_loss: 1.3131 - val_rmse_dl: 1.1390
Epoch 497/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3111 - rmse_dl: 1.1378 - val_loss: 1.3212 - val_rmse_dl: 1.1415
Epoch 498/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3112 - rmse_dl: 1.1386 - val_loss: 1.3158 - val_rmse_dl: 1.1396
Epoch 499/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3082 - rmse_dl: 1.1366 - val_loss: 1.3165 - val_rmse_dl: 1.1405
Epoch 500/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3088 - rmse_dl: 1.1366 - val_loss: 1.3208 - val_rmse_dl: 1.1420
Epoch 501/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3068 - rmse_dl: 1.1354 - val_loss: 1.3384 - val_rmse_dl: 1.1489
Epoch 502/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3088 - rmse_dl: 1.1375 - val_loss: 1.3277 - val_rmse_dl: 1.1443
Epoch 503/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3060 - rmse_dl: 1.1362 - val_loss: 1.3174 - val_rmse_dl: 1.1404
Epoch 504/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3059 - rmse_dl: 1.1350 - val_loss: 1.3256 - val_rmse_dl: 1.1436
Epoch 505/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3095 - rmse_dl: 1.1379 - val_loss: 1.3170 - val_rmse_dl: 1.1403
Epoch 506/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3101 - rmse_dl: 1.1376 - val_loss: 1.3388 - val_rmse_dl: 1.1489
Epoch 507/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3053 - rmse_dl: 1.1350 - val_loss: 1.3134 - val_rmse_dl: 1.1394
Epoch 508/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3062 - rmse_dl: 1.1370 - val_loss: 1.3192 - val_rmse_dl: 1.1411
Epoch 509/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3021 - rmse_dl: 1.1342 - val_loss: 1.3243 - val_rmse_dl: 1.1434
Epoch 510/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3066 - rmse_dl: 1.1356 - val_loss: 1.3165 - val_rmse_dl: 1.1400
Epoch 511/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3059 - rmse_dl: 1.1352 - val_loss: 1.3345 - val_rmse_dl: 1.1477
Epoch 512/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3091 - rmse_dl: 1.1362 - val_loss: 1.3327 - val_rmse_dl: 1.1472
Epoch 513/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3128 - rmse_dl: 1.1387 - val_loss: 1.3118 - val_rmse_dl: 1.1380
Epoch 514/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3020 - rmse_dl: 1.1336 - val_loss: 1.3297 - val_rmse_dl: 1.1452
Epoch 515/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3027 - rmse_dl: 1.1348 - val_loss: 1.3870 - val_rmse_dl: 1.1682
Epoch 516/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3072 - rmse_dl: 1.1367 - val_loss: 1.3255 - val_rmse_dl: 1.1435
Epoch 517/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3049 - rmse_dl: 1.1352 - val_loss: 1.3412 - val_rmse_dl: 1.1503
Epoch 518/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3076 - rmse_dl: 1.1370 - val_loss: 1.3218 - val_rmse_dl: 1.1426
Epoch 519/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3092 - rmse_dl: 1.1372 - val_loss: 1.3130 - val_rmse_dl: 1.1391
Epoch 520/550
338/338 [==============================] - 1s 3ms/step - loss: 1.3071 - rmse_dl: 1.1365 - val_loss: 1.3245 - val_rmse_dl: 1.1430
Epoch 521/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3056 - rmse_dl: 1.1349 - val_loss: 1.3130 - val_rmse_dl: 1.1393
Epoch 522/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3070 - rmse_dl: 1.1356 - val_loss: 1.3192 - val_rmse_dl: 1.1411
Epoch 523/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3086 - rmse_dl: 1.1365 - val_loss: 1.3176 - val_rmse_dl: 1.1407
Epoch 524/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3055 - rmse_dl: 1.1361 - val_loss: 1.3214 - val_rmse_dl: 1.1419
Epoch 525/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3059 - rmse_dl: 1.1359 - val_loss: 1.3327 - val_rmse_dl: 1.1463
Epoch 526/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3104 - rmse_dl: 1.1380 - val_loss: 1.3174 - val_rmse_dl: 1.1402
Epoch 527/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3083 - rmse_dl: 1.1369 - val_loss: 1.3225 - val_rmse_dl: 1.1422
Epoch 528/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3046 - rmse_dl: 1.1351 - val_loss: 1.3398 - val_rmse_dl: 1.1491
Epoch 529/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3103 - rmse_dl: 1.1379 - val_loss: 1.3199 - val_rmse_dl: 1.1415
Epoch 530/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3062 - rmse_dl: 1.1365 - val_loss: 1.3259 - val_rmse_dl: 1.1434
Epoch 531/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3075 - rmse_dl: 1.1364 - val_loss: 1.3216 - val_rmse_dl: 1.1425
Epoch 532/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3083 - rmse_dl: 1.1369 - val_loss: 1.3248 - val_rmse_dl: 1.1434
Epoch 533/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3061 - rmse_dl: 1.1347 - val_loss: 1.3265 - val_rmse_dl: 1.1440
Epoch 534/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3086 - rmse_dl: 1.1368 - val_loss: 1.3191 - val_rmse_dl: 1.1412
Epoch 535/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3051 - rmse_dl: 1.1356 - val_loss: 1.3230 - val_rmse_dl: 1.1427
Epoch 536/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3074 - rmse_dl: 1.1358 - val_loss: 1.3142 - val_rmse_dl: 1.1395
Epoch 537/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3041 - rmse_dl: 1.1343 - val_loss: 1.3166 - val_rmse_dl: 1.1400
Epoch 538/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3066 - rmse_dl: 1.1362 - val_loss: 1.3281 - val_rmse_dl: 1.1447
Epoch 539/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3075 - rmse_dl: 1.1366 - val_loss: 1.3154 - val_rmse_dl: 1.1393
Epoch 540/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3055 - rmse_dl: 1.1353 - val_loss: 1.3210 - val_rmse_dl: 1.1417
Epoch 541/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3065 - rmse_dl: 1.1355 - val_loss: 1.3138 - val_rmse_dl: 1.1395
Epoch 542/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3044 - rmse_dl: 1.1358 - val_loss: 1.3179 - val_rmse_dl: 1.1412
Epoch 543/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3078 - rmse_dl: 1.1368 - val_loss: 1.3209 - val_rmse_dl: 1.1420
Epoch 544/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3073 - rmse_dl: 1.1368 - val_loss: 1.3166 - val_rmse_dl: 1.1403
Epoch 545/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3042 - rmse_dl: 1.1349 - val_loss: 1.3260 - val_rmse_dl: 1.1442
Epoch 546/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3054 - rmse_dl: 1.1356 - val_loss: 1.3119 - val_rmse_dl: 1.1387
Epoch 547/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3066 - rmse_dl: 1.1368 - val_loss: 1.3258 - val_rmse_dl: 1.1434
Epoch 548/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3111 - rmse_dl: 1.1384 - val_loss: 1.3253 - val_rmse_dl: 1.1435
Epoch 549/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3043 - rmse_dl: 1.1349 - val_loss: 1.3319 - val_rmse_dl: 1.1465
Epoch 550/550
338/338 [==============================] - 1s 4ms/step - loss: 1.3058 - rmse_dl: 1.1350 - val_loss: 1.3272 - val_rmse_dl: 1.1440
Epoch 1/25
  1/338 [..............................] - ETA: 0s - loss: 1.3389 - rmse_dl: 1.1543WARNING:tensorflow:Callbacks method `on_train_batch_end` is slow compared to the batch time (batch time: 0.0000s vs `on_train_batch_end` time: 0.0157s). Check your callbacks.
338/338 [==============================] - 3s 10ms/step - loss: 1.3043 - rmse_dl: 1.1362 - val_loss: 1.3135 - val_rmse_dl: 1.1391
Epoch 2/25
338/338 [==============================] - 1s 4ms/step - loss: 1.3077 - rmse_dl: 1.1372 - val_loss: 1.3138 - val_rmse_dl: 1.1392
Epoch 3/25
338/338 [==============================] - 1s 3ms/step - loss: 1.3048 - rmse_dl: 1.1355 - val_loss: 1.3137 - val_rmse_dl: 1.1392
Epoch 4/25
338/338 [==============================] - 1s 3ms/step - loss: 1.3088 - rmse_dl: 1.1375 - val_loss: 1.3140 - val_rmse_dl: 1.1393
Epoch 5/25
338/338 [==============================] - 1s 3ms/step - loss: 1.3019 - rmse_dl: 1.1340 - val_loss: 1.3143 - val_rmse_dl: 1.1394
Epoch 6/25
338/338 [==============================] - 1s 3ms/step - loss: 1.3050 - rmse_dl: 1.1352 - val_loss: 1.3142 - val_rmse_dl: 1.1394
Epoch 7/25
338/338 [==============================] - 1s 3ms/step - loss: 1.3084 - rmse_dl: 1.1374 - val_loss: 1.3136 - val_rmse_dl: 1.1392
Epoch 8/25
338/338 [==============================] - 1s 3ms/step - loss: 1.3059 - rmse_dl: 1.1352 - val_loss: 1.3141 - val_rmse_dl: 1.1394
Epoch 9/25
338/338 [==============================] - 1s 4ms/step - loss: 1.3004 - rmse_dl: 1.1336 - val_loss: 1.3140 - val_rmse_dl: 1.1393
Epoch 10/25
338/338 [==============================] - 1s 4ms/step - loss: 1.3034 - rmse_dl: 1.1350 - val_loss: 1.3135 - val_rmse_dl: 1.1391
Epoch 11/25
338/338 [==============================] - 1s 4ms/step - loss: 1.3038 - rmse_dl: 1.1352 - val_loss: 1.3137 - val_rmse_dl: 1.1392
Epoch 12/25
338/338 [==============================] - 1s 4ms/step - loss: 1.3056 - rmse_dl: 1.1365 - val_loss: 1.3142 - val_rmse_dl: 1.1394
Epoch 13/25
338/338 [==============================] - 1s 3ms/step - loss: 1.3054 - rmse_dl: 1.1358 - val_loss: 1.3140 - val_rmse_dl: 1.1394
Epoch 14/25
338/338 [==============================] - 1s 3ms/step - loss: 1.3014 - rmse_dl: 1.1352 - val_loss: 1.3144 - val_rmse_dl: 1.1395
Epoch 15/25
338/338 [==============================] - 1s 3ms/step - loss: 1.3081 - rmse_dl: 1.1367 - val_loss: 1.3133 - val_rmse_dl: 1.1391
Epoch 16/25
338/338 [==============================] - 1s 3ms/step - loss: 1.3031 - rmse_dl: 1.1343 - val_loss: 1.3133 - val_rmse_dl: 1.1391
Epoch 17/25
338/338 [==============================] - 1s 3ms/step - loss: 1.3058 - rmse_dl: 1.1358 - val_loss: 1.3142 - val_rmse_dl: 1.1394
Epoch 18/25
338/338 [==============================] - 1s 3ms/step - loss: 1.3066 - rmse_dl: 1.1371 - val_loss: 1.3136 - val_rmse_dl: 1.1392
Epoch 19/25
338/338 [==============================] - 1s 3ms/step - loss: 1.3045 - rmse_dl: 1.1351 - val_loss: 1.3144 - val_rmse_dl: 1.1395
Epoch 20/25
338/338 [==============================] - 1s 3ms/step - loss: 1.3032 - rmse_dl: 1.1351 - val_loss: 1.3142 - val_rmse_dl: 1.1394
Epoch 21/25
338/338 [==============================] - 1s 3ms/step - loss: 1.3089 - rmse_dl: 1.1377 - val_loss: 1.3139 - val_rmse_dl: 1.1393
Epoch 22/25
338/338 [==============================] - 1s 3ms/step - loss: 1.3033 - rmse_dl: 1.1351 - val_loss: 1.3148 - val_rmse_dl: 1.1397
Epoch 23/25
338/338 [==============================] - 1s 4ms/step - loss: 1.3009 - rmse_dl: 1.1340 - val_loss: 1.3146 - val_rmse_dl: 1.1396
Epoch 24/25
338/338 [==============================] - 1s 4ms/step - loss: 1.3058 - rmse_dl: 1.1366 - val_loss: 1.3138 - val_rmse_dl: 1.1393
Epoch 25/25
338/338 [==============================] - 1s 4ms/step - loss: 1.3076 - rmse_dl: 1.1367 - val_loss: 1.3140 - val_rmse_dl: 1.1394
RMSE Neural Network log final: 1.146
RMSE Neural Network final: 22.539
In [ ]: